Hadoop, as you may know, is not really a database, although it can act much like one. At its core is a distributed file system designed to store and process very large volumes of data, largely regardless of data format. It is not a good idea, however, to query any of that data in real time.
Open source Hadoop is roughly a decade old, though most serious development has occurred in the last five years. Apache Hadoop is a framework, written in Java, for running applications on large clusters built from commodity hardware. It is meant to handle large-scale data processing, and it does represent the future of data. It addresses two problems. First, it avoids data loss through replication: the system keeps redundant copies of the data so that, in the event of a failure, another copy is available. The Hadoop Distributed File System (HDFS) takes care of this problem. Second, it includes a popular open source implementation of MapReduce, a powerful tool designed for deep analysis and transformation of very large data sets.
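To make the MapReduce idea concrete, here is a minimal sketch in plain Python that counts words on a single machine. It does not use Hadoop or any Hadoop API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample input are assumptions chosen for illustration. It only mirrors the three stages a MapReduce job goes through: map each record to key/value pairs, group the pairs by key, then reduce each group to one result.

```python
from collections import defaultdict
from typing import Iterable, Iterator


def map_phase(line: str) -> Iterator[tuple[str, int]]:
    """Emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[tuple[str, int]]) -> dict[str, list[int]]:
    """Group emitted values by key, the step that sits between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: list[int]) -> tuple[str, int]:
    """Collapse all the counts for one word into a single total."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> dict[str, int]:
    """Run map, shuffle and reduce in sequence over the input lines."""
    mapped = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input, standing in for files stored on a cluster.
    sample = ["big data big clusters", "data on commodity hardware"]
    print(word_count(sample))
```

On a real cluster the map and reduce steps would run in parallel on many machines, each working on the blocks of data stored nearby. The sketch keeps everything in one process so the flow of data is easy to follow.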
Created in 2005 by Doug Cutting (who named it after his son's toy elephant) and Mike Cafarella, Hadoop was originally intended for web-related search data. Today, it is an open-source, community-built project of the Apache Software Foundation that is used in all kinds of organisations and industries. Microsoft is an active contributor to the community development effort.
"Microsoft has logged over 6,000 engineering hours in the last year, committing code and driving innovation in partnership with the open source community across a range of Hadoop projects. In addition, we have committers on Hadoop, and Microsoft employee Chris Douglas is the Apache Working Group Chair for Hadoop," said David Campbell, Microsoft Fellow and CTO.
Built for big data, everyday servers
One reason for Hadoop's popularity is simple economics. Processing big data sets once required supercomputers and other pricey, specialised hardware. Hadoop makes reliable, scalable, distributed computing possible on industry-standard servers, allowing you to tackle petabytes of data and beyond on smaller budgets. Hadoop is also designed to scale from a single server to thousands of machines, and to detect and handle failures at the application layer for better reliability.
Researchers at Virginia Tech are using Hadoop to sift through petabytes of DNA data for new cancer therapies and antibiotics.
Insights from all kinds of data
According to one report, as much as 80 percent of the data organisations deal with today is not the kind that comes neatly packaged in columns and rows. Instead, it is a messy avalanche of emails, social media feeds, satellite images, GPS signals, server logs and other unstructured, non-relational files. Hadoop's other big advantage is that it can handle nearly any file or format, so organisations can pose questions they never thought possible.
"By using Windows Azure, HDInsight and SQL Server 2012, we can collect, analyse and generate near-real-time BI with Big Data collected from social media feeds, GPS signals and data from government systems," said Luis Sanz Marco, City of Barcelona.
City of Barcelona Deploys Big Data BI Solution to Improve Lives and Create a Smart-City Template
Barcelona, Spain, wanted better insight into government effectiveness. To achieve this, it needed a solution that could collect and analyze data from its systems and new public sources such as social media, software log files, and GPS signals. Working with Microsoft partner Bismart, the city built a hybrid cloud-based Big Data solution that features data and business intelligence services based on Windows Azure, Windows Azure HDInsight Service, Microsoft SQL Server 2012 including Analysis Services, and Microsoft Excel 2010. As a result, Barcelona provides near real-time insight into structured and unstructured data, supports BI access using any Internet-connected device, improves services and business opportunities, improves safety and public health, and provides a smart-city template that boosts collaboration between Barcelona, its citizens and businesses, and other global.
To summarize, Hadoop can dramatically increase your speed of innovation.