
Posts from the ‘Hadoop’ Category

Gartner Search Analytics Shows Spike in Hadoop Inquiries in 2012 – Good News For CRM

Hadoop was one of the most-searched terms on Gartner’s website from 2011 through 2012, with search volume spiking 601.8% over the last twelve months alone.  Additional insights from the Search Analytics on Hadoop include the following:

  • 27% of all inquiries are from banking, finance and insurance industries, followed by manufacturing (14%), government (13%), services (10%) and healthcare (8%).
  • North America (75.9%) and EMEA (13.5%) are the two most dominant geographies in terms of query volume.
  • Here is the trend line from Gartner Search Analytics:

What’s driving Hadoop’s meteoric rise in searches is a combination of industry hype about big data, CIOs getting serious about using Hadoop distributions that minimize time and risk yet deliver value, and the dominant role Amazon is playing in bringing Hadoop into the cloud.  Today Amazon offers Elastic MapReduce as a web service that relies on a hosted Hadoop framework running on Amazon Elastic Compute Cloud (EC2) in conjunction with Amazon Simple Storage Service (S3).

Microsoft also scored a major hiring win this week, announcing that Raghu Ramakrishnan, former chief scientist for three divisions of Yahoo, has joined the company.  Raghu is now a technical fellow working in the Server and Tools Business (STB), where he’ll focus on big data and integration with STB platforms.  Big Data on Azure will accelerate now with him on board.

Hadoop’s Potentially Galvanizing Effect on CRM and Social CRM Analytics

The quickening pace of Hadoop adoption in the enterprise is good news for CRM and especially social CRM. Analytics and Business Intelligence (BI) are the “glue” that unify CRM and keep it in context. One of Hadoop’s greatest potential contributions is the analysis, categorization and use of unstructured content.  Marketing and sales won’t have to run three or four systems to gain insights into customer data; they can run a single analytics platform that fuels the entire selling cycle and lifetime customer value chain of their businesses.  Hadoop has the potential to make unstructured content more meaningful while also reporting the impact of customer insights on financial performance, profitability and lifetime customer value.

Translating terabytes of customer, sales, services and partner data into meaningful analytics and business intelligence (BI) is emerging as a priority for CIOs, who are sharing responsibility for driving top-line revenue growth.   Hadoop shows potential to be the “glue” or galvanizing technology base that unifies all CRM and Social CRM strategies.

To get a perspective on how fast Hadoop is being evaluated and adopted, it’s useful to look at the Hype Cycle for Data Management, the latest edition of which was published in July 2011.  This is another indicator of how quickly Hadoop and big data are gaining CIO mindshare.  Big Data and extreme information management are in the Technology Trigger area of the hype cycle.  The Hype Cycle for Data Management is shown below:

Bottom line:  CRM and Social CRM will benefit more than any other area of an enterprise as Hadoop’s adoption continues to accelerate.  CIOs are increasingly called upon to be strategists, and with the ability to translate terabytes of data into strategies that deliver dollars, look for Hadoop’s contributions to drive top-line revenue growth.

Data Science Shows Potential To Redefine Cloud-based Analytics

The emerging field of data science is a fascinating one that has major implications for the potential of cloud-based analytics, CRM, search, supply chain management and logistics.

Instead of relying purely on latent semantic indexing or the Google PageRank algorithm to define relevance of a search, data science techniques analyze content and its context to determine relevance.  Google today looks at the content of a page; data science considers its surrounding data and relevance.

Earlier this month TechCrunch published the blog post Marissa Mayer’s Next Big Thing: “Contextual Discovery” — Google Results Without Search.  The techniques of contextual discovery Google is experimenting with rely on a very rapid aggregation and transforming of data, which are part of the methodologies of data science.   When Google moves fully into contextual discovery the potential exists for cloud-based analytics, CRM, search, supply chain management and logistics to be completely revolutionized by solving the big data problems associated with each of these areas.

In CRM, this would mean finally being able to access external and internal content (including the massive amount of data on social networks), aggregate the data, and transform it into meaningful analysis.  The vision of social CRM would be realized once data science serves as the catalyst of contextual search or, as Google calls it, contextual discovery.

Exploring Data Science

Two of the best blog posts on the emerging topic of data science are both from O’Reilly Radar: What is data science? by Mike Loukides and Six months after “What is data science?” by Mac Slocum.  Both are worth reading and giving some serious thought to.  O’Reilly has also created a free report titled What is Data Science, which can be downloaded here.

Authors Mike Loukides and Mac Slocum lay the foundation for how transformational data science could be by concentrating on the nascent area of data products.  A data product is the result of accessing, aggregating and transforming content regardless of its location – and capturing data on its attributes – not just the data itself. Both authors point to reference systems and guided reference engines on e-commerce sites as just the beginning.  Yet after reading their assessments and listening to the interview below with Roger Magoulas, O’Reilly’s Director of Research, it is clear there are many more potential uses of this evolving area.

Potential Impact of Data Science on Analytics

The blog posts by Mike Loukides and Mac Slocum explain in detail how the different areas of data science are at varying levels of maturity.  After reading these over and considering the big data problems in cloud-based analytics, CRM, search, supply chain management and logistics, the following methodology starts to make sense:

Access – For data science to realize its full potential there needs to be a technology layer that provides real-time access to structured and unstructured content both within and outside an enterprise.  More than a traditional Enterprise Application Integration (EAI) layer, the technologies driving data access need to selectively pull all available content from every unstructured and structured data source available.  Mike Loukides mentions Google Goggles and how MapReduce has made this application possible.  Hadoop as a means to create greater access across federated content has much potential in this phase as well.

Aggregate – Called data conditioning by Mike Loukides, the aggregation phase is where contextual discovery happens.  This could be accomplished through contextual search filters, taxonomies defined by specific alerts, or the use of the MapReduce and Hadoop query and relevance tools in use today.

Transform – This is where Hadoop could be used to drive data analysis, a level of analysis Mike Loukides calls data jiujitsu.  Both Mike Loukides and Mac Slocum mention examples, including the Hadoop Online Prototype (HOP), which does real-time stream processing, and several others.  The impact of the access, aggregate and transform methodology on visualization can be seen at Flowing Data, one of the best sites on the Web for seeing how MapReduce, Hadoop and other data science-related techniques are taking on massive amounts of data and delivering insights.
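To make the access, aggregate and transform steps a little more concrete, here is a minimal sketch in plain Python.  It is not Hadoop and not taken from either O’Reilly post; it only imitates the map and reduce pattern in memory.  The sample records, the source names and every function name are hypothetical and exist purely for illustration.

```python
from collections import defaultdict

# Hypothetical sample records standing in for content pulled from several sources.
SOURCES = {
    "crm": [{"customer": "acme", "text": "renewal signed, great support"}],
    "social": [
        {"customer": "acme", "text": "support was slow today"},
        {"customer": "globex", "text": "great product, great price"},
    ],
}


def access(sources):
    """Access: pull every record from every source, tagging each with its origin."""
    for name, records in sources.items():
        for record in records:
            yield {**record, "source": name}


def map_phase(record):
    """Map: emit ((customer, word), 1) pairs for each word in a record."""
    for word in record["text"].replace(",", " ").split():
        yield (record["customer"], word.lower()), 1


def aggregate(records):
    """Aggregate: group the mapped pairs by key and sum them (the reduce step)."""
    counts = defaultdict(int)
    for record in records:
        for key, value in map_phase(record):
            counts[key] += value
    return dict(counts)


def transform(counts, term):
    """Transform: turn raw counts into a per-customer view of a single term."""
    return {customer: n for (customer, word), n in counts.items() if word == term}


counts = aggregate(access(SOURCES))
great_by_customer = transform(counts, "great")
print(great_by_customer)
```

On a real cluster the map and reduce functions would be distributed across many machines rather than run in a single loop, but the shape of the three steps stays the same.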

Conclusion

Solving the big data problems of social media monitoring, sentiment analysis, forming a scalable platform for social CRM, integrating CRM, supply chain management and logistics data to demand management – and tying all of these areas to financial performance – is potentially achievable with data science.  Deploying it as a cloud-based platform opens up even greater potential for getting the most out of social networks, free data sources, and third-party databases than is possible today.

Be sure to check out the video below, in which Roger Magoulas, O’Reilly’s Director of Research, is interviewed about data science.

Article links:

What is data science? By Mike Loukides  O’Reilly Radar
Six months after “What is data science?”  by Mac Slocum O’Reilly Radar

Hadoop Predicted To Be More Disruptive than Linux

Abhishek Mehta is Managing Director for Big Data and Analytics for Bank of America, and serves as Executive in Residence at MIT Media Lab.  SiliconAngle.tv founder John Furrier and Wikibon co-founder David Vellante interviewed him at Hadoop World last month.  Abhishek sees Hadoop as being more disruptive than Linux, and leading to the formation of data factories.  He also sees Hadoop giving programmers greater freedom to concentrate on creating algorithms that solve much larger, more complex problems than is possible today.

Here is a quote from the interview:
“So these data factories are going to emerge as the new drivers of innovation of a massive revolution that will change fundamentally how business models extract value, because data is going to be, is the core asset in a multitude of industries.”

At just under 30 minutes, this is a fascinating look into the future of Hadoop.

Transcript of the interview by Bert Latamore

Source attribution: SiliconAngle.tv video
