Sunday, October 31, 2010

HTTP

HTTP stands for HyperText Transfer Protocol; this was one of the topics I was asked about at my job interview.


So why do we need HTTP? Simply put, the whole globe is one networked village today thanks to the World Wide Web and the Internet, and HTTP is the foundation of data communication for the World Wide Web.
Who developed HTTP? Development of HTTP standards has been coordinated by the Internet Engineering Task Force (IETF) and the World Wide Web Consortium (W3C).

Next, let us see how this protocol actually works.
HTTP functions as a request-response protocol in the client-server computing model. In a client-server model, a web browser acts as a client and an application running on a computer hosting a web site functions as a server. The client or the web browser submits an HTTP request message to the server. The server returns a response message to the client. This response contains completion status information about the request and may contain any content requested by the client in its message body.
HTTP is an application layer protocol designed within the framework of the Internet Protocol Suite.
How does HTTP identify and locate the resources it needs? Resources are identified and located on the network by Uniform Resource Identifiers (URIs), or Uniform Resource Locators (URLs), using the http or https URI schemes.
The original version of HTTP was HTTP/1.0, which was later revised as HTTP/1.1. The earlier version used a separate connection to the same server for every request-response transaction, whereas HTTP/1.1 can reuse a connection multiple times. This is called a persistent connection.
Such persistent connections reduce communication delays, because the client does not need to re-negotiate the TCP connection after the first request has been sent.
An HTTP session is simply a sequence of such network request-response transactions.
In addition, HTTP is a stateless protocol. Thus it does not require the server to retain information or status about each user for the duration of multiple requests.
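
As a quick illustration of the request-response cycle described above, here is a minimal Python sketch using the standard http.client module (the host example.com is just a placeholder). It sends two GET requests over a single HTTP/1.1 connection, which is exactly the persistent-connection behaviour mentioned earlier.

    import http.client

    # Open one TCP connection to the server; HTTP/1.1 keeps it alive by default.
    conn = http.client.HTTPConnection("example.com", 80)

    for path in ("/", "/index.html"):
        # The client submits an HTTP request message...
        conn.request("GET", path)
        # ...and the server returns a response with a status line, headers and a body.
        response = conn.getresponse()
        print(response.status, response.reason)    # e.g. 200 OK
        body = response.read()                      # the body must be read before the connection is reused
        print(len(body), "bytes received for", path)

    conn.close()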

Clustering

Well, this is a concept I was reminded of recently at a presentation, since I had learnt it as a subject at university. It was a bit of a confusing topic to me in those days, especially with all the algorithms involved. However, I guess I now have a better grasp of it and can explain the concept in a simple way that most people could follow.


Clustering groups objects so that the members of each group are similar to one another. These objects can be any kind of data. 


So why should data be clustered like this? It's simple: clustering helps uncover important patterns in seemingly unimportant data, and such patterns are invaluable to organizations such as business firms, research firms etc. Some typical applications are listed below. 

  • Marketing: finding groups of customers with similar behavior when given a large database of customer data containing their properties and past buying records
  • Biology: classification of plants and animals given their features
  • Libraries: book ordering
  • Insurance: identifying groups of motor insurance policy holders with a high average claim cost and also identifying frauds
  • WWW: document classification and clustering weblog data to discover groups of similar access patterns

Clustering algorithms are commonly classified into four types:



  1. Exclusive Clustering - data are grouped in an exclusive way: if a certain datum belongs to one cluster, it cannot be included in another
  2. Overlapping Clustering - uses fuzzy sets to cluster data, so each point may belong to two or more clusters with different degrees of membership
  3. Hierarchical Clustering - based on the union of the two nearest clusters; initially every datum forms its own cluster, and after a number of iterations the desired final clusters are reached
  4. Probabilistic Clustering - uses a completely probabilistic approach



The four most widely used clustering algorithms are as follows:

  • K-means
  • Fuzzy C-means
  • Hierarchical clustering
  • Mixture of Gaussians

K-means is an exclusive clustering algorithm, whereas Fuzzy C-means is an overlapping clustering algorithm. Hierarchical clustering, naturally, belongs to the hierarchical category, and Mixture of Gaussians is classified under probabilistic clustering.
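
To make the K-means idea concrete, here is a minimal sketch in Python using NumPy and some made-up two-dimensional data points; it is only an illustration of the exclusive-clustering approach, not a production implementation.

    import numpy as np

    def kmeans(points, k, iterations=100, seed=0):
        # A bare-bones K-means: assign every point to its nearest centroid,
        # then recompute each centroid as the mean of its assigned points.
        rng = np.random.default_rng(seed)
        centroids = points[rng.choice(len(points), size=k, replace=False)]
        for _ in range(iterations):
            # Exclusive assignment: each point belongs to exactly one cluster.
            distances = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
            labels = distances.argmin(axis=1)
            new_centroids = np.array([points[labels == j].mean(axis=0) if np.any(labels == j)
                                      else centroids[j] for j in range(k)])
            if np.allclose(new_centroids, centroids):
                break  # centroids have stopped moving, so the algorithm has converged
            centroids = new_centroids
        return labels, centroids

    # Two obvious groups of 2-D points, e.g. customers described by two features.
    data = np.array([[1.0, 1.1], [0.9, 1.0], [1.2, 0.8],
                     [8.0, 8.2], [7.9, 8.1], [8.3, 7.8]])
    labels, centroids = kmeans(data, k=2)
    print(labels)     # e.g. [0 0 0 1 1 1]
    print(centroids)  # one centroid near (1, 1), the other near (8, 8)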



Saturday, October 30, 2010

Mobile Rewards

I guess almost every person who's used a mobile phone is familiar with one or more reward schemes offered by the mobile operator. From world-renowned operators like AT&T to operators that are not that famous, each company has focused on its own reward programs. So what are these mobile reward programs, and why are these companies so interested in them?

For one thing, it is essential for mobile service providers to retain their customer base in this competitive economy. Such programs keep their customers attracted to the service. For example, customers may be rewarded based on their usage of services; those who frequently take calls, people who use the IDD facility often, frequent Internet users etc. will be provided with special discount schemes tailored to them. In addition, attractive point schemes where users have the opportunity to redeem points at places like supermarkets, movie ticket stalls etc. have gained immense popularity. In such situations, customers can provide their mobile number at the sales counter, and it will be used as the unique identifier to redeem the points owned by the customer and pay for the goods bought. This is a very attractive scheme, as it eliminates the hassle of obtaining or carrying loyalty vouchers.

Secondly, providing such schemes lets service providers gain a competitive edge and attract new customers, who are always searching for new features and facilities; it is no big deal nowadays to switch your mobile service provider in an instant. Thus, more and more effort is being put into creating reward schemes for users and keeping them loyal to the mobile operator.

Monday, October 25, 2010

Data Replication vs Data Mirroring

Just this morning I had the chance to participate in a presentation on database replication. It seemed like a good topic to blog about today, until I came across the data mirroring technique. Even though the two techniques look almost identical to the unobservant eye and the terms are frequently used interchangeably, a subtle yet strong difference sets them apart conceptually. 
Data replication and mirroring are both transparent ways of making the database server more tolerant to faults. Such methods are essential to ensure the consistency of business data. However, the two methods differ considerably in how they preserve data, as explained below.


In mirroring, a single database server maintains a copy of a specific dbspace on a separate disk. Such a remote mirrored data volume comprises two identical copies of the data connected by Fibre Channel. Both sides of a mirror process read and write I/Os (inputs/outputs) to ensure that each copy is a real-time duplicate of the other. The mirror copies are kept in separate data centers, connected over a local- or metro-area network (LAN or MAN). This mechanism protects the data in the mirrored dbspaces against disk failure, because the database server automatically updates data on both disks and automatically uses the other disk if one of the dbspaces fails.

Data replication duplicates all the data that a database server manages on an entirely separate database server (not just specified dbspaces). As this method involves two separate database servers, it protects the data not just against disk failures, but against all types of database server failures, including a computer failure or the catastrophic failure of an entire site. The source and target servers used for replication are usually separated by a significant distance to safeguard data from disasters that affect a specific geographic location, such as a region-wide power outage.

There are two principal modes of replication:
  • Asynchronous replication - the primary and secondary data volumes are no more than a few milliseconds out of sync, so the replication is nearly real-time
  • Synchronous replication - the primary and secondary copies are always identical, so it provides true real-time duplication
Hence, even though remote mirroring, like replication, uses redundancy to guarantee data availability, the two techniques should not be confused with each other. 
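
Purely as a toy illustration of the difference between the two modes described above (and not of how any particular database product implements them), here is a small Python sketch: the synchronous primary waits for the replica to apply each write before acknowledging it, while the asynchronous primary acknowledges immediately and ships the change in the background.

    import queue
    import threading
    import time

    class Replica:
        # A stand-in for the secondary database server.
        def __init__(self):
            self.data = {}

        def apply(self, key, value):
            time.sleep(0.01)          # pretend there is network / disk latency
            self.data[key] = value

    class Primary:
        # A toy primary server illustrating the two replication modes.
        def __init__(self, replica, synchronous=True):
            self.data = {}
            self.replica = replica
            self.synchronous = synchronous
            self._log = queue.Queue()
            # Background thread that ships writes to the replica asynchronously.
            threading.Thread(target=self._ship_log, daemon=True).start()

        def _ship_log(self):
            while True:
                key, value = self._log.get()
                self.replica.apply(key, value)

        def write(self, key, value):
            self.data[key] = value
            if self.synchronous:
                # Synchronous: do not acknowledge until the replica has the change,
                # so both copies are always identical.
                self.replica.apply(key, value)
            else:
                # Asynchronous: acknowledge immediately; the replica may lag
                # behind by a few milliseconds.
                self._log.put((key, value))

    replica = Replica()
    primary = Primary(replica, synchronous=False)
    primary.write("account:42", 100)
    print("primary:", primary.data, "replica:", replica.data)  # the replica may still be empty
    time.sleep(0.05)
    print("a moment later, replica:", replica.data)            # now it has caught up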

Sunday, October 24, 2010

USSD

Since I started my career as a technical writer for the Business Analyst team, I have been coming across a lot of USSD-based work. Thus, I suppose it's only fair that I write a blog entry on USSD. 

What does the acronym USSD mean? USSD stands for Unstructured Supplementary Service Data, a technology unique to GSM. It is a capability built into the GSM standard and is used to transmit information over the signaling channels of the GSM network; in essence, it is used to communicate with the service provider's computers. USSD provides session-based communication, which in turn enables a variety of applications. Unlike SMS, USSD provides a real-time connection during a session. This connection remains open, allowing a two-way exchange of a sequence of data, which makes USSD more responsive than services that use SMS.

Some of the uses of USSD can be listed as follows:
  • WAP browsing
  • Prepaid callback services (e.g. cheaper phone charges while roaming)
  • Location-based content services
  • Menu-based information services (stock quotes, sports results)
  • Configuring the phone on the network


Users do not need to access any particular phone menu to enjoy USSD services; they can enter the USSD command directly from the phone's home screen. This is much faster than SMS, as it does not involve the store-and-forward technique of SMS, nor does it involve an SMSC. 
The USSD commands are routed back to the home mobile network's Home Location Register (HLR), allowing for the virtual home environment concept. This gives USSD-based services the ability to work just as well, and in exactly the same way, when users are roaming.
Both the SIM Application Toolkit and the Wireless Application Protocol (WAP) support USSD, and the technology works on all existing GSM mobile phones.

USSD Operation

USSD is used to send text between the user and some application; it acts as a trigger rather than an application itself. However, it enables other applications such as prepaid services. In reality, it is hard to bill for USSD directly, so the charge is applied to the application associated with the use of USSD, such as circuit-switched data, SMS, or prepaid.


Format of a USSD message

A typical USSD message starts with an asterisk (*) and is followed by digits that comprise commands or data. Groups of digits may be separated by additional asterisks. The message is terminated with a number sign (#).

Example USSD codes:
  • *101#
  • *109*72348937857623#
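
As a small illustration of the format just described, here is a Python sketch that checks whether a string looks like a USSD code and splits it into its digit groups. The regular expression is only a rough approximation of the full syntax used on real networks.

    import re

    # Pattern for the format described above: an asterisk, then groups of digits
    # separated by further asterisks, terminated by a number sign.
    USSD_PATTERN = re.compile(r"^\*\d+(\*\d+)*#$")

    def parse_ussd(code):
        # Return the digit groups of a USSD string, or None if it is malformed.
        if not USSD_PATTERN.match(code):
            return None
        return code.strip("*#").split("*")

    print(parse_ussd("*101#"))                  # ['101']
    print(parse_ussd("*109*72348937857623#"))   # ['109', '72348937857623']
    print(parse_ussd("101#"))                   # None (missing the leading asterisk)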

Network Gateway

My blog posts mostly contain facts about terms I hear at work daily; this is another term (gateway) that I came across and thought of researching to discover what exactly it does.


Gateways happen to work on all seven OSI layers (the OSI, or Open Systems Interconnection, is a network model that defines a framework for implementing protocols in seven layers). A gateway converts protocols among communication networks, which is why gateways are also called protocol converters. Consequently, different networks using different protocols can communicate with each other. It is essential for a gateway to understand the protocols used by each network linked to it.


A gateway may consist of devices such as protocol translators, impedance matching devices, rate converters, fault isolators, or signal translators as necessary to provide system interoperability. In addition, mutually acceptable administrative procedures must be established between both networks.


A router actually acts as a gateway too. However, the difference lies in the fact that a router by itself transfers, accepts and relays packets only across networks using similar protocols. A gateway, on the other hand, is able to accept a packet formatted for one protocol (e.g. AppleTalk) and convert it to a packet formatted for another protocol (e.g. TCP/IP) before forwarding it. A gateway can be implemented in hardware, software or a combination of both, but is usually implemented as software installed within a router. Comparatively, gateways are slower than bridges, switches and (non-gateway) routers.
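
To illustrate the protocol-converter role in the simplest possible terms, here is a toy Python sketch in which a "gateway" accepts a message in a made-up legacy key=value format and forwards it as JSON (and back again). Both formats are invented purely for this example; a real gateway would of course translate between actual network protocols.

    import json

    def legacy_to_json(legacy_frame):
        # Translate a toy "key=value;..." frame into a JSON message.
        fields = dict(pair.split("=", 1) for pair in legacy_frame.strip(";").split(";"))
        return json.dumps(fields)

    def json_to_legacy(json_message):
        # Translate the JSON message back into the toy legacy frame.
        fields = json.loads(json_message)
        return ";".join(f"{key}={value}" for key, value in fields.items()) + ";"

    # The gateway accepts a packet formatted for one protocol and forwards it
    # re-formatted for the other, so the two networks never need to speak
    # each other's language directly.
    incoming = "src=NODE7;dst=NODE9;payload=hello;"
    forwarded = legacy_to_json(incoming)
    print(forwarded)                  # {"src": "NODE7", "dst": "NODE9", "payload": "hello"}
    print(json_to_legacy(forwarded))  # src=NODE7;dst=NODE9;payload=hello;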

Another way to define a gateway is as a network point that acts as an entrance to another network. 


On the Internet, a node or stopping point can be either a gateway node or a host (end-point) node. To explain further, both the computers of Internet users and the computers that serve pages to users are host nodes, while the nodes that connect the networks in between are gateways.
E.g.: the computers that control traffic between company networks, or the computers used by Internet service providers (ISPs) to connect users to the Internet, are gateway nodes.


There can be gateways for different technologies, for example SMS gateways, USSD gateways etc. So the concept of a network gateway is as simple as that!

Friday, October 22, 2010

GSM vs CDMA

GSM = Global System for Mobile Communications
CDMA = Code Division Multiple Access

These are the two main competing network technologies in the cellular service world. What exactly the difference is between these two technologies might be worth knowing if you are in a position to choose either one of these services for your needs. Some of the major features of these technologies can be compared as follows.


The core difference between GSM and CDMA has to do with the way data is converted into the radio waves that the cellphone broadcasts and receives. GSM divides the frequency bands into multiple channels so that more than one user can place a call through a tower at the same time; CDMA networks layer digitized calls over one another and unpack them on the back end with sequence codes.

CDMA and GSM are both radio technologies and need a mobile phone, which acts as an antenna, to receive these radio signals. The two technologies work on different frequency ranges and have different modulation schemes, so the mobile phones for each are different. However, a 3G mobile can serve both GSM and CDMA radio signals, as 3G technology is a combination of both GSM and CDMA (WCDMA).
A SIM (Subscriber Identity Module) is used to provide credentials for network authorization. As GSM and CDMA have different authorization functions and parameters, their SIMs are different as well. In CDMA technology the SIM is often called a RUIM (Removable User Identity Module). However, not all companies provide SIM cards for CDMA-based services. In GSM, there is the chance of losing all your data when the SIM is lost, if you have not stored it in the phone's memory.
2G GSM networks can offer better coverage in mountainous terrain, as they use taller cell towers. Additionally, GSM (and UMTS) phones can send and receive data packets while making a call, which most CDMA networks still don't support.
A CDMA-only phone can only roam on other CDMA networks, which is a huge disadvantage when travelling. This limitation does not affect GSM carriers; people with a tri-band GSM phone can use it in almost every part of the world.

Although call costs are becoming cheaper every other day for GSM, the lowest call charges are still offered by CDMA. In addition, roaming call costs in GSM continue to be higher than in CDMA, and as of now the call quality of CDMA is better than that of GSM.


On the other hand, calls are comparatively more secure in GSM than in CDMA. 



Further, more and more value-added services like GPRS, EDGE etc. are being added to GSM, which makes it all the more popular. In addition, CDMA services have not yet gained facilities for web-based services like messengers, downloading ringtones from websites, etc.

 
When it comes to power consumption, GSM handsets consume less than CDMA handsets.

However, GSM phones are not foolproof and can be tampered with. Moreover, CDMA does not offer the variety of handsets that GSM does, and CDMA handsets are incompatible with GSM networks as well. GSM offers a wide variety of handsets and service providers to choose from. 








Thursday, October 21, 2010

Mobile Positioning


This is a term I came across today which I was unfamiliar with. Positioning the mobile for what purpose, and how? These were some of the mind-boggling questions that sprang up in my head. Simply put, mobile positioning is used to provide Location Based Services (LBS) to various parties, as well as wireless emergency services. 

So what is LBS? A location-based service (LBS) is an information and entertainment service that is accessible with mobile devices through the mobile network and makes use of the geographical position of the mobile device. Some examples of such LBS would be discovering the nearest bank ATM or the whereabouts of an employee, vehicle tracking services, personalized weather reports etc.


It should be clear by now that mobile positioning refers to determining the position of the mobile device, which in turn is used to provide LBS. Mobile positioning should not be confused with mobile location, even though the terms are sometimes used interchangeably in conversation; they are really two different things. Mobile location refers to the location estimate derived from the mobile positioning operation.

There are various means of mobile positioning. However, they can be divided into two major categories:
  1. Network-based positioning
  2. Handset-based positioning

Network-based Mobile Positioning Technology
Here, the mobile network, in conjunction with network-based Position Determination Equipment (PDE), is used to position the mobile device; consequently, this category is referred to as "network-based". Examples of this technology include the following.


1. SS7 and Mobile Positioning
This is one of the easiest means of positioning the mobile user! 
It leverages the SS7 network to derive location. When a user invokes a service that requires the Mobile Switching Center (MSC) to launch a message to an LBS residing on an SCP (Service Control Point), the MSC may launch an SS7 message. This SS7 message contains the Cell of Origin (COO), or Cell ID, of the cell site currently serving the user. The COO may be used by the LBS to approximate the location of the user. This type of positioning therefore has a large degree of uncertainty, as the COO can cover a large area; the LBS application should take this into account when maintaining its quality of service.

2. Network based PDE
The COO is not always available.
E.g.: via SS7 with non-GSM WAP-based services.
In addition, the COO does not always assure the quality of the LBS application. Hence, in such situations, network-based (or handset-based) PDE must be employed.

3. Angle of Arrival Method
When using this method, the Angle of Arrival (AOA) of a signal between the mobile phone and the cellular antenna is analyzed. AOA PDE captures AOA information and uses it in calculations to determine an estimate of the mobile device's position.


4. Time of Arrival Method
The Time of Arrival (TOA) of signals between the mobile phone and the cellular antenna is used in the calculations of this method. TOA PDE captures the Time Difference of Arrival (TDOA) information, which is then used in calculations to determine an estimate of the mobile device's position (a small sketch after these methods illustrates the underlying geometry).

5. Radio Propagation Techniques
Such techniques make use of a previously determined mapping of the radio frequency (RF) characteristics in order to determine an estimate of the mobile device position.

6. Hybrid Methods
These hybrid methods combine AOA and TOA and use the best of both to provide a better positioning result.
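
Purely to illustrate the geometry behind the TOA/TDOA idea mentioned above, here is a small Python/NumPy sketch: given the distances from a handset to three cell sites (obtained in practice from signal travel times), it estimates the handset's position by least squares. The tower coordinates and handset position are invented for the example.

    import numpy as np

    def estimate_position(towers, distances):
        # Least-squares position estimate from ranges to known tower positions.
        # towers: (n, 2) array of tower coordinates; distances: length-n array of
        # ranges derived from signal travel time (distance = travel time * speed of light).
        towers = np.asarray(towers, dtype=float)
        d = np.asarray(distances, dtype=float)
        x0, y0 = towers[0]
        # Subtracting the first range equation from the others linearises the problem.
        A = 2 * (towers[1:] - towers[0])
        b = (d[0] ** 2 - d[1:] ** 2
             + towers[1:, 0] ** 2 - x0 ** 2
             + towers[1:, 1] ** 2 - y0 ** 2)
        position, *_ = np.linalg.lstsq(A, b, rcond=None)
        return position

    # Three towers with made-up coordinates (in metres); the "true" handset
    # position is used only to simulate the measured ranges.
    towers = [(0.0, 0.0), (3000.0, 0.0), (0.0, 4000.0)]
    true_position = np.array([1200.0, 900.0])
    distances = [np.linalg.norm(true_position - np.array(t)) for t in towers]

    print(estimate_position(towers, distances))   # approximately [1200.  900.]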


Handset-based Mobile Positioning Technology
The handset itself is the primary means of positioning the user in this category. However, the network can be used to assist in acquiring the mobile device and/or making position estimates based on measurement data and handset-based position determination algorithms.

1. SIM Toolkit
The SIM Toolkit (STK) acts as an API between the Subscriber Identity Module (SIM) of a GSM mobile phone and an application, and provides a means of positioning a mobile unit. The positioning information provided by this method may be as approximate as the COO, or it may be better through additional means such as the mobile network operation called Timing Advance (TA) or a procedure called the Network Measurement Report (NMR). In all cases, the STK allows for communication between the SIM (which may contain additional algorithms for positioning) and a location server application (which may contain additional algorithms to assist in mobile positioning). STK is a good technique for obtaining position information when the mobile device is in the idle state.

2. Enhanced Observed Time Difference (E-OTD)
The basic method is the same as TOA, except that the handset is far more actively involved in the positioning process. Specially equipped handsets are required for this purpose. This method is also referred to as reversed TOA or handset-based TOA.

3. GPS 
Well this is a popular name. Even I know it!!
GPS, or the Global Positioning System, is perhaps the best-known handset-based PDE. When satellites are available, GPS can be the most accurate method by itself. However, this technology is often enhanced by the network: Assisted GPS (A-GPS) refers to a PDE system that uses additional network equipment deployed to help acquire the mobile device (much faster than non-assisted GPS) and to provide positioning when the system is unsuccessful in acquiring any, or enough, satellites.


Mobile IN technologies for Positioning
These can be deployed to assist in the positioning process.
The value of mobile IN lies in leveraging the SS7 and IN networks to obtain location, especially for mid-call/session position updates. Mobile IN may also be quite valuable for idle-call positioning; however, it requires integration on the mobile network side to ensure that current position information is made available.


The Importance of LBS Middleware
LBS middleware consists of applications that do not provide the services themselves, but rather enable location-based services.
The location manager function of LBS middleware may be employed to convert positioning information into useful location information and make it available to LBS applications. One of the major advantages of this location manager function is that it enables various positioning technologies to be used in conjunction with various LBS applications; hence it acts as a gateway, or hub, for location.

Tuesday, October 19, 2010

Why use Open Source?

On a brief note, Open Source Software (OSS) is software whose licenses give users the freedom to run the program for any purpose, to study and modify the program, and to redistribute copies of either the original or modified program. Also called Free Software, OSS tends to be among the most influential software in the world today. However, it is fascinating to discover that open source is not just about software: there are other open-source products such as hardware, beverages, digital products and products related to health and science, although the main focus of this article is on OSS.

Compared to proprietary software such as Microsoft applications, Open Source Software possesses some unique qualities that make it all the more popular among developers. Even though Microsoft provides a small number of free products such as Internet Explorer, what makes OSS special is that it allows modification and redistribution of the software. Listed below are just a few reasons to make OSS our first choice.

1. Cost & Usage 
What is the strongest factor of OSS that has won the hearts of millions?
Well, obviously, being free of charge, or at least low in cost, is the most appealing benefit of using Open Source Software.
For example:
Ubuntu, a Linux distribution, is free to download, easy to install, easy to use, easy to update, and comes without the tiresome issue of product licensing. Moreover, it is possible to install as many instances of Ubuntu as we want and be confident that a Linux desktop will slot into existing networks without a fuss. 
In addition, there is usually no need to install anti-virus software, since Linux platforms are far less prone to virus attacks and hacking attempts. 
Software such as OpenOffice, which provides almost the same functionality as any MS Office package, requires no license and can be installed on Windows systems at zero cost. Databases such as MySQL Server, IDEs such as Eclipse, and web servers such as Apache can all be downloaded for free as OSS.
2. Support 
If almost all OSS is free, then how do open source companies earn any revenue? 
The answer lies in the support mechanisms of these companies: they provide quality and reliable support, training and maintenance services for their customers. Most customer companies are more than happy to pay for these, as the benefit of using OSS clearly outweighs the cost of maintenance compared to other products on the market. The customer support of commercial OSS projects is unparalleled in the market, as the sole income of those projects depends on it. E.g.: JBoss (now a subsidiary of Red Hat).
Open source gives Independent Software Vendors a host of advantages. As open source companies foster and benefit from their user and developer communities, a user with a particular concern or requirement can often gain access to the individual developer, resulting in more rapid and responsive support. However, if the support isn't good enough for our taste, or we feel that we have the internal resources to maintain the product ourselves, we can always download the software for free.
3. Quality 
One factor contributing to the quality of OSS is that major hardware, mobile phone and chip manufacturers have not only contributed their ideas and software under the GPL and its variants, but have also actively participated in free software projects.
When using OSS, cost might not always be the primary motive, although it is significant. A few more reasons empowering the usage of OSS are its reliability, resiliency and adaptability. 
Another example of how industries make use of Open Source Software is the large-scale adoption of Linux and other open source solutions by the telecommunication and finance sectors, because of massive price/performance improvements over Unix and Windows.
Good habits are another significant factor contributing to the quality of OSS. These so-called 'good habits' are followed in the maintenance of the software: processes and discussions are recorded and archived, and transparency, simplicity, modularity and portability are maintained throughout software development. In addition, the mailing lists, version control systems and bug tracker databases enforce good habits on the developers, as geographical separation of developers is a common feature of OSS projects.
Such practices reduce the overkill and duplication of code that are common in commercial development environments.
4. TCO 
Using OSS provides the unmistakable advantage of lowering the Total Cost of Ownership (TCO), thanks to its reliability, security and flexibility.
Security - it costs less in terms of time and effort to achieve and maintain an acceptable level of security
Stability & reliability - reduced maintenance and support overhead


5. Security 
How much did you spend on securing a Windows system? I bet it's more than you would like to admit.
Linux and Unix systems are largely free from the kinds of security threats that plague Windows users, and this is considered a major advantage of open source software. The trojans, viruses and malicious interlopers that are common on Windows systems are unknown to the majority of Linux users (Linux distributions include update mechanisms and also provide instantaneous security advisories).
Furthermore, Linux systems are built for networking and have a superior record for overall security. They are not entirely immune from outside threats, but the system architecture is much less vulnerable to attack. 
6. Upgrades, anyone? 
The unnecessary upgrades forcibly pushed upon you every year by other commercial product vendors are unheard of for Linux users.
For example, Linux and OpenOffice will run on lower-spec PCs and fulfil the functionality required by 95 per cent of office users.
7. Single vendor dependence 
Open source removes the need to depend on single-vendor solutions, such as Microsoft products, that leave us with limited choices and tend to push up prices. 

Linux, by contrast, is available in a large variety of flavours, runs on a greater variety of computer architectures than any other operating system, and is available on many different platforms from all the main hardware vendors. Hence, it is easy to move from one Linux to another, and from Linux to another operating system.
8. Interoperability and open standards 
These are two of the principles that form the basis of the open source concept.
Maintaining open standards for document formats and protocols is a first principle of open source software, as it provides a clean intersection between different implementations of software and hardware.
Interoperability has been a goal of every kind of computing product since the beginning of the electronic era: computer systems should produce outputs in common formats that allow one computer to talk to another.
The purpose of open standards is to promote interoperability between different applications on different operating systems, exactly the opposite effect of proprietary data formats, which encourage reliance on single-vendor applications and discourage the implementation of competitive products.
Open standards allow users to be platform, vendor and software independent. These standards make networking possible, and make it easier to upgrade and move customised software solutions from one platform to another.
9. Access to technology at the source 
Because open source has allowed, encouraged and extended support to research and development laboratories in academia, public service and commercial industry, it gives access to technologies that might otherwise be prohibitively expensive. This, in turn, has led to increased participation and feedback.
E.g:-
GNU/Linux and open source have led the field in clustering and virtualisation technologies, which were initially developed from academic research. (A side effect of this is that Linux has revived the market for the mainframe.)
Another successful example of how Linux has been utilised to superior effect is Google, which based its operations on free software in its early days. Customising Linux and the Google File System on clustered servers to build its original search and storage systems was the search giant's first step up the ladder of success.
10. Freedom 
I suppose OSS users know the meaning of freedom better than anyone else, as they have the freedom to run, copy, distribute, modify and redistribute the software, actions that could land you in legal trouble with most other software. This kind of freedom is only available to developers and users who have found refuge in the Open Source Community.
Examples of open-source software products are:
  • GNU Project - "a sufficient body of free software"
  • FreeBSD - operating system derived from Unix
  • Alfresco - content management system
  • Apache - HTTP web server
  • Apache Tomcat - web container
  • Drupal - content management system
  • Eclipse - software development environment comprising an integrated development environment (IDE)
  • Ephesoft - intelligent document capture, mailroom automation
  • Joomla - content management system
  • Linux - operating system based on Unix
  • MediaWiki - wiki server software, the software that runs Wikipedia
  • MongoDB - document-oriented, non-relational database
  • Moodle - course management system or virtual learning environment
  • Mozilla Firefox - web browser
  • Mozilla Thunderbird - email client
  • OpenBSD - operating system derived from Unix
  • OpenOffice.org - office suite
  • OpenSIS - open source Student Information System
  • OpenSolaris - Unix operating system from Sun Microsystems
  • osCommerce - e-commerce
  • PeaZip - file archiver
  • PHP - scripting language suited for the web
  • Stockfish - chess engine series, considered to be one of the strongest chess programs in the world
  • Symbian - real-time mobile operating system
  • TYPO3 - content management system
  • WordPress - content management system / blog software
  • 7-Zip - file archiver
  • Many, many more
Some interesting and mind-blowing statistics for OSS

Market Share
1. The most popular web server has always been OSS/FS since such data have been collected. For example, Apache is the current #1 web server.


2. GNU/Linux is the #2 web serving OS on the public Internet (counting by physical machine), according to a study by Netcraft surveying March and June 2001.

OS group       | Percentage (March) | Percentage (June) | Composition
Windows        | 49.2%              | 49.6%             | Windows 2000, NT4, NT3, Windows 95, Windows 98
[GNU/]Linux    | 28.5%              | 29.6%             | [GNU/]Linux
Solaris        | 7.6%               | 7.1%              | Solaris 2, Solaris 7, Solaris 8
BSD            | 6.3%               | 6.1%              | BSDI BSD/OS, FreeBSD, NetBSD, OpenBSD
Other Unix     | 2.4%               | 2.2%              | AIX, Compaq Tru64, HP-UX, IRIX, SCO Unix, SunOS 4 and others
Other non-Unix | 2.5%               | 2.4%              | MacOS, NetWare, proprietary IBM OSes
Unknown        | 3.6%               | 3.0%              | not identified by Netcraft OS detector

3. GNU/Linux is the #1 server OS on the public Internet (counting by domain name), according to a 1999 survey of primarily European and educational sites.

Operating System | Market Share | Composition
GNU/Linux        | 28.5%        | GNU/Linux
Windows          | 24.4%        | All Windows combined (including 95, 98, NT)
Sun              | 17.7%        | Sun Solaris or SunOS
BSD              | 15.0%        | BSD family (FreeBSD, NetBSD, OpenBSD, BSDI, ...)
IRIX             | 5.3%         | SGI IRIX

4. GNU/Linux was the #2 server OS sold in 1999, 2000, and 2001.

5. An Evans Data survey published in November 2001 found that 48.1% of international developers and 39.6% of North American developers planned to target most of their applications to GNU/Linux. In October 2002, Evans Data found that 59% of developers expected to write Linux applications in the next year.

6. An IBM-sponsored study on Linux suggested that GNU/Linux has “won” the server war as of 2006, as 83% were using GNU/Linux to deploy new systems versus only 23% for Windows.

7. Half of all mission-critical business applications are expected to run on GNU/Linux by 2012

8. An Evans Data survey made public in February 2004 found that 1.1 million developers in North America were working on OSS/FS projects.

9. A 2004 InformationWeek survey found that 67% of companies use OSS/FS products, with another 16% expecting to use them in 2005; only 17% have no near-term plans to support OSS/FS products. 

10. A Japanese survey found widespread use and support for GNU/Linux; overall use of GNU/Linux jumped from 35.5% in 2001 to 64.3% in 2002 of Japanese corporations, and GNU/Linux was the most popular platform for small projects.

The use of Linux servers in user enterprises:

System                 | 2002  | 2001
Linux server           | 64.3% | 35.5%
Windows 2000 Server    | 59.9% | 37.0%
Windows NT Server      | 64.3% | 74.2%
Commercial Unix server | 37.7% | 31.2%

Reasons given:

Increase of importance in the future | 44.1%
Requirement from their customers     | 41.2%
Major OS in their market             | 38.2%
Free of licence fee                  | 37.5%
Most reasonable OS for their purpose | 36.0%
Open source                          | 34.6%
High reliability                     | 27.2%


11. Microsoft sponsored its own research to “prove” that GNU/Linux is not as widely used, but this research has been shown to be seriously flawed.

12. Businesses plan to increase their use of GNU/Linux.

Expected GNU/Linux Use | Small Business | Midsize Business | Large Business | Total
50% increase           | 21.0%          | 16%              | 19.0%          | 19%
10-25% increase        | 30.5%          | 42%              | 56.5%          | 44%
No growth              | 45.5%          | 42%              | 24.5%          | 36%
Reduction              | 3.0%           | 0%               | 0%             | 1%

13. The global top 1000 Internet Service Providers expect GNU/Linux use to increase by 154%, according to Idaya’s survey conducted January through March 2001.

14. IBM found a 30% growth in the number of enterprise-level applications for GNU/Linux in the six month period ending June 2001.

15. Revenue from sales of GNU/Linux-based server systems increased 90% in the fourth quarter of 2002 compared to the fourth quarter of 2001.

16. In a survey of business users by Forrester Research Inc., 52% said that they are now replacing Windows servers with Linux.

17. A 2001 survey found that 46.6% of IT professionals were confident that their organizations could support GNU/Linux, a figure larger than any OS except Windows.

18. MailChannel’s survey (published 2007) showed that the top two email servers (Sendmail and Postfix) are OSS/FS programs.

19. A survey in the second quarter of 2000 found that 95% of all reverse-lookup domain name servers (DNS) used BIND, an OSS/FS product.

20. A survey in May 2004 found that over 75% of all DNS domains are serviced by an OSS/FS program.

21. PHP is the web's #1 server-side scripting language.

22. OpenSSH is the Internet’s #1 implementation of the SSH security protocol.

23. CMP TSG/Insight found that 41% of application development tools were OSS/FS, and VARBusiness found 20% of all companies using GNU/Linux.

24. MySQL’s market share is growing faster than Windows’.

25. Internet Explorer has been losing market share to OSS/FS web browsers (such as Mozilla Firefox) since mid-2004, a trend especially obvious in leading indicators such as technology sites, web development sites, and bloggers.


26. As of 2004, a CSC study determined that an astonishing 14% of the large enterprise office systems market is using OSS/FS OpenOffice.org.

27. A February 2005 survey of developers and database administrators found that 64% use an Open Source database.

28. BusinessWeek reports that hardware companies are selling more than $1 billion in servers to run Linux every quarter.

29. InformationWeek's February 2005 survey reported significant use of GNU/Linux, and that 90% of companies anticipate a jump in server licenses for GNU/Linux.

30. Optaros, a consulting firm, reports that 87% of organizations are now using open-source software; BusinessWeek claims that this demonstrates that OSS/FS has greatly expanded into businesses.

31. IDC’s Spring 2006 survey found that developers around the world are increasing their use of OSS/FS.

Reliability
1. A study by Reasoning found that the Linux kernel’s implementation of the TCP/IP Internet protocol stack had fewer defects than the equivalent stacks of several proprietary general-purpose operating systems, and equalled the best of the embedded operating systems.

2. A similar study by Reasoning found that the MySQL database (a leading OSS/FS database) had fewer defects than a set of 200 proprietary programs used for comparison.

3. A study by Coverity found that the Linux kernel had far fewer defects than the industry average.

4. Sites using Microsoft's IIS web serving software have, on average, over double the time offline of sites using Apache, according to a 3-month Swiss evaluation.

Downtime  | Apache | Microsoft | Netscape | Other
September | 5.21   | 10.41     | 3.85     | 8.72
October   | 2.66   | 8.39      | 2.80     | 12.05
November  | 1.83   | 14.28     | 3.39     | 6.85
Average   | 3.23   | 11.03     | 3.35     | 9.21

5. 80% of the top ten most reliable hosting providers ran OSS/FS, according to Netcraft’s May 2004 survey

6. A detailed study of two large programs (the Linux kernel and the Mozilla web browser) found evidence that OSS/FS development processes produce more modular designs.

Program            | Change Cost
Mozilla-1998-04-08 | 17.35%
Mozilla-1998-10-08 | 18.00%
Mozilla-1998-12-11 | 2.78%
Mozilla-1999       | 3.80%
Linux-2.1.88       | 3.72%
Linux-2.1.105      | 5.16%


                       | Linux 2.1.105 | Mozilla 1998-04-08 | Mozilla 1998-12-11
Number of Source Files | 1678          | 1684               | 1508
Coordination Cost      | 20,918,992    | 30,537,703         | 10,234,903

7. German import company Heinz Tröber found Linux-based desktops to be far more reliable than Windows desktops; Windows had a 15% daily failure rate, while Linux had 0%. 

Performance
1. In February 2003, scientists broke the Internet2 Land Speed Record using GNU/Linux.

2. Benchmarks comparing Sun Solaris x86 and GNU/Linux found many similarities, but GNU/Linux had double the performance in web operations.

3. Anandtech's August 2005 comparison of Mac OS X and GNU/Linux found that the Linux-based system ran five to eight times faster on server tasks (specifically using MySQL).

4. Microsoft itself found that two OSS/FS operating systems, Linux and FreeBSD, had better performance than Windows by many measures.

Scalability
GNU/Linux dominates in supercomputing: it is used in 78% of the world's 500 fastest supercomputers and in most of the world's ten fastest, including the world's most powerful supercomputer (as of March and November 2005).

Security
1. J.S. Wurzler Underwriting Managers’ “hacker insurance” costs 5-15% more if Windows is used instead of Unix or GNU/Linux for Internet operation.

2. Most defaced web sites are hosted by Windows, and Windows sites are disproportionately defaced more often than explained by its market share.

3. Linux systems last longer than unpatched Windows systems, according to a combination of studies from the Honeynet Project, AOL, and others.

4. A 2002 survey of developers found that GNU/Linux systems are relatively immune from attacks from outsiders. 

5. Apache has a better security record than Microsoft’s IIS, as measured by reports of serious vulnerabilities.

6. Surveys report that GNU/Linux systems experience fewer viruses and successful cracks.

7. According to a June 2004 study by Sandvine, 80% of all spam is sent by infected Windows PCs.

Total Cost of Ownership
1. OSS/FS costs less to initially acquire.


                      | Microsoft Solution | OSS/FS (GNU/Linux) Solution | Savings by using GNU/Linux
Company A (50 users)  | $69,987            | $80                         | $69,907
Company B (100 users) | $136,734           | $80                         | $136,654
Company C (250 users) | $282,974           | $80                         | $282,894

2. Upgrade/maintenance costs are typically far less.

3. OSS/FS does not impose license management costs, does not in practice include noxious licensing clauses, and avoids nearly all licensing litigation risks.

4. OSS/FS can often use older hardware more efficiently than proprietary systems, yielding smaller hardware costs and sometimes eliminating the need for new hardware.

5. When used as the basis of an application-server system, total hardware costs drop by orders of magnitude.

6. An Italian study in 2002 found GNU/Linux to have a TCO 34.84% less than Windows.

7. Forrester Research found that the average savings on TCO when using OSS/FS database management systems (DBMSs) is 50%.

8. Even Microsoft has admitted that its products are more costly than GNU/Linux.

The Open Source Definition

Open source software is officially defined by the Open Source Definition, reproduced below for a further understanding of the concept.

The distribution terms of open-source software must comply with the following criteria: 

1. Free Redistribution 
The license shall not restrict any party from selling or giving away the software as a component of an aggregate software distribution containing programs from several different sources. The license shall not require a royalty or other fee for such sale.
2. Source Code 
The program must include source code, and must allow distribution in source code as well as compiled form. Where some form of a product is not distributed with source code, there must be a well-publicized means of obtaining the source code for no more than a reasonable reproduction cost, preferably downloading via the Internet without charge. The source code must be the preferred form in which a programmer would modify the program. Deliberately obfuscated source code is not allowed. Intermediate forms such as the output of a preprocessor or translator are not allowed.
3. Derived Work 
The license must allow modifications and derived works, and must allow them to be distributed under the same terms as the license of the original software.
4. Integrity of the Author's source code 
The license may restrict source code from being distributed in modified form only if the license allows the distribution of "patch files" with the source code for the purpose of modifying the program at build time. The license must explicitly permit distribution of software built from modified source code. The license may require derived works to carry a different name or version number from the original software.
5. No discrimination against Persons or Groups
The license must not discriminate against any person or group of persons.
6. No discrimination against Fields of Endeavor
The license must not restrict anyone from making use of the program in a specific field of endeavor. For example, it may not restrict the program from being used in a business, or from being used for genetic research.
7. Distribution of License
The rights attached to the program must apply to all to whom the program is redistributed without the need for execution of an additional license by those parties.
8. License Must Not be Specific To a Product
The rights attached to the program must not depend on the program’s being part of a particular software distribution. If the program is extracted from that distribution and used or distributed within the terms of the program’s license, all parties to whom the program is redistributed should have the same rights as those that are granted in conjunction with the original software distribution.
9. The License Must Not Restrict Other Software
The license must not place restrictions on other software that is distributed along with the licensed software. For example, the license must not insist that all other programs distributed on the same medium must be open-source software.
10. License Must Be Technology-Neutral
No provision of the license may be predicated on any individual technology or style of interface.