Big Data
What it is and why it matters
Big data is a term that describes the large volume of data – both structured and unstructured – that inundates a business on a day-to-day basis. But it’s not the amount of data that’s important. It’s what organizations do with the data that matters. Big data can be analyzed for insights that lead to better decisions and strategic business moves.
History of Big Data
The term “big data” refers to data that is so large, fast or complex that it’s difficult or impossible to process using traditional methods. The act of accessing and storing large amounts of information for analytics has been around a long time. But the concept of big data gained momentum in the early 2000s when industry analyst Doug Laney articulated the now-mainstream definition of big data as the three V’s:
Volume: Organizations collect data from a variety of sources, including business transactions, smart (IoT) devices, industrial equipment, videos, social media and more. In the past, storing it would have been a problem – but cheaper storage on platforms like data lakes and Hadoop has eased the burden.
Velocity: With the growth in the Internet of Things, data streams into businesses at an unprecedented speed and must be handled in a timely manner. RFID tags, sensors and smart meters are driving the need to deal with these torrents of data in near-real time.
Variety: Data comes in all types of formats – from structured, numeric data in traditional databases to unstructured text documents, emails, video, audio, stock ticker data and financial transactions.
At SAS, we consider two additional dimensions when it comes to big data:
Variability:
In addition to the increasing velocities and varieties of data, data flows are unpredictable – changing often and varying greatly. It’s challenging, but businesses need to know when something is trending in social media, and how to manage daily, seasonal and event-triggered peak data loads.
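One simple way to picture spotting an event-triggered peak is to compare each new count against a trailing baseline. The sketch below is only an illustration under stated assumptions: the function name `find_spikes`, the seven-day window, the three-standard-deviation threshold and the sample mention counts are all hypothetical, not a description of any particular product.

```python
from collections import deque
from statistics import mean, stdev


def find_spikes(counts, window=7, threshold=3.0):
    """Return the indexes whose value sits far above the trailing average.

    `window` and `threshold` are illustrative defaults, not recommendations.
    """
    history = deque(maxlen=window)  # holds only the most recent `window` values
    spikes = []
    for i, value in enumerate(counts):
        if len(history) == window:
            baseline = mean(history)
            # Floor the spread at 1.0 so a perfectly flat window cannot
            # make every tiny change look like a spike.
            spread = max(stdev(history), 1.0)
            if value > baseline + threshold * spread:
                spikes.append(i)
        history.append(value)
    return spikes


# Hypothetical daily social media mention counts.
daily_mentions = [100, 104, 98, 101, 97, 103, 99, 102, 480, 110]
print(find_spikes(daily_mentions))
```

A fixed trailing window like this ignores weekly or seasonal cycles, so a real workload would likely need a baseline that accounts for them.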
Veracity:
Veracity refers to the quality of data. Because data comes from so many different sources, it’s difficult to link, match, cleanse and transform data across systems. Businesses need to connect and correlate relationships, hierarchies and multiple data linkages. Otherwise, their data can quickly spiral out of control.
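To make "link, match, cleanse" concrete, here is a minimal sketch that cleanses a shared key and then matches records from two systems on it. Everything in it is assumed for illustration: the two source systems ("CRM" and "billing"), the field names, the choice of email as the matching key and the sample rows.

```python
def normalize_email(raw):
    """Cleanse step: trim whitespace and lowercase so equal addresses compare equal."""
    return raw.strip().lower()


def link_records(crm_rows, billing_rows):
    """Match step: pair each CRM row with the billing row that shares its email."""
    billing_by_email = {normalize_email(r["email"]): r for r in billing_rows}
    linked, unmatched = [], []
    for row in crm_rows:
        key = normalize_email(row["email"])
        match = billing_by_email.get(key)
        if match is None:
            unmatched.append(row)  # keep for manual review instead of dropping
        else:
            linked.append(
                {"email": key, "name": row["name"].strip(), "balance": match["balance"]}
            )
    return linked, unmatched


# Hypothetical rows from two systems that format the same address differently.
crm = [
    {"name": " Ana Ruiz ", "email": "Ana.Ruiz@Example.com "},
    {"name": "Ben Ito", "email": "ben@example.com"},
]
billing = [{"email": "ana.ruiz@example.com", "balance": 120.5}]

linked, unmatched = link_records(crm, billing)
print(linked)
print(unmatched)
```

Exact matching on one cleansed key is the simplest case; records without a reliable shared key usually call for fuzzier matching rules.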
Optimized production with big data analytics
At USG Corporation, using big data with predictive analytics is key to fully understanding how products are made and how they work. And in a market with a barrage of global competition, manufacturers like USG know the importance of producing high-quality products at an affordable price. Using the SAS Platform, USG has removed guesswork and optimized its production investments. The results: improved product quality and time to market.
Why Is Big Data Important?
The importance of big data doesn’t revolve around how much data you have, but what you do with it. You can take data from any source and analyze it to find answers that enable 1) cost reductions, 2) time reductions, 3) new product development and optimized offerings, and 4) smart decision making. When you combine big data with high-powered analytics, you can accomplish business-related tasks such as:
- Determining root causes of failures, issues and defects in near-real time.
- Generating coupons at the point of sale based on the customer’s buying habits.
- Recalculating entire risk portfolios in minutes.
- Detecting fraudulent behavior before it affects your organization.
Big Data in Today’s World
Big data – and the way organizations manage and derive insight from it – is changing the way the world uses business information. Learn more about big data’s impact.
Data Integration Deja Vu: Big Data Reinvigorates DI
To stay relevant, data integration needs to work with many different types and sources of data, while operating at different latencies – from real time to streaming. Learn how DI has evolved to meet modern requirements.
Building your data and analytics strategy
Wondering how to build a world-class analytics organization? Make sure information is reliable. Empower data-driven decisions across lines of business. Drive the strategy. And know how to wring every last bit of value out of big data.
Data lake and data warehouse – know the difference
Is the term "data lake" just marketing hype? Or a new name for a data warehouse? Phil Simon sets the record straight about what a data lake is, how it works and when you might need one.
New Analytics Ecosystem
Cloud, containers and on-demand compute power – a SAS survey of more than 1,000 organizations explores technology adoption and illustrates how embracing specific approaches positions you to successfully evolve your analytics ecosystems.
Who's focusing on big data?
Big data is a big deal for industries. The onslaught of IoT and other connected devices has created a massive uptick in the amount of information organizations collect, manage and analyze. Along with big data comes the potential to unlock big insights – for every industry, large to small.
- Retail
- Manufacturing
- Banking
- Health Care
- Education
- Small and Midsize Businesses
- Government
- Insurance
Retail
Customer relationship building is critical to the retail industry – and the best way to manage that is to manage big data. Retailers need to know the best way to market to customers, the most effective way to handle transactions, and the most strategic way to bring back lapsed business. Big data remains at the heart of all those things.
Manufacturing
Armed with insight that big data can provide, manufacturers can boost quality and output while minimizing waste – processes that are key in today’s highly competitive market. More and more manufacturers are working in an analytics-based culture, which means they can solve problems faster and make more agile business decisions.
Banking
With large amounts of information streaming in from countless sources, banks are faced with finding new and innovative ways to manage big data. While it’s important to understand customers and boost their satisfaction, it’s equally important to minimize risk and fraud while maintaining regulatory compliance. Big data brings big insights, but it also requires financial institutions to stay one step ahead of the game with advanced analytics.
Health Care
Patient records. Treatment plans. Prescription information. When it comes to health care, everything needs to be done quickly, accurately – and, in some cases, with enough transparency to satisfy stringent industry regulations. When big data is managed effectively, health care providers can uncover hidden insights that improve patient care.
Education
Educators armed with data-driven insight can make a significant impact on school systems, students and curriculums. By analyzing big data, they can identify at-risk students, make sure students are making adequate progress, and implement a better system for evaluating and supporting teachers and principals.
Small and Midsize Businesses
Between the ease of collecting big data and the increasingly affordable options for managing, storing and analyzing data, SMBs have a better chance than ever of competing with their bigger counterparts. SMBs can use big data with analytics to lower costs, boost productivity, build stronger customer relationships, and minimize risk and fraud.
Government
When government agencies are able to harness and apply analytics to their big data, they gain significant ground when it comes to managing utilities, running agencies, dealing with traffic congestion or preventing crime. But while there are many advantages to big data, governments must also address issues of transparency and privacy.
Insurance
Telematics, sensor data, weather data, drone and aerial image data – insurers are swamped with an influx of big data. Combining big data with analytics provides new insights that can drive digital transformation. For example, big data helps insurers better assess risk, create new pricing policies, make highly personalized offers and be more proactive about loss prevention.
“Deep learning craves big data because big data is necessary to isolate hidden patterns and to find answers without over-fitting the data. With deep learning, the more good quality data you have, the better the results.”
Wayne Thompson, SAS Product Manager
Data-driven innovation
Today’s exabytes of big data open countless opportunities to capture insights that drive innovation. From more accurate forecasting to increased operational efficiency and better customer experiences, sophisticated uses of big data and analytics propel advances that can change our world – improving lives, healing sickness, protecting the vulnerable and conserving resources.
How Big Data works
Before businesses can put big data to work for them, they should consider how it flows among a multitude of locations, sources, systems, owners and users. There are five key steps to taking charge of this big “data fabric” that includes traditional, structured data along with unstructured and semistructured data:
- Set a big data strategy.
- Identify big data sources.
- Access, manage and store the data.
- Analyze the data.
- Make data-driven decisions.
1) Set a big data strategy
At a high level, a big data strategy is a plan designed to help you oversee and improve the way you acquire, store, manage, share and use data within and outside of your organization. A big data strategy sets the stage for business success amid an abundance of data. When developing a strategy, it’s important to consider existing – and future – business and technology goals and initiatives. This calls for treating big data like any other valuable business asset rather than just a byproduct of applications.
2) Know the sources of big data
- Streaming data flows into IT systems from the Internet of Things (IoT) and other connected devices, such as wearables, smart cars, medical devices and industrial equipment. You can analyze this big data as it arrives, deciding which data to keep, which to discard and which needs further analysis.
- Social media data stems from interactions on Facebook, YouTube, Instagram, etc. This includes vast amounts of big data in the form of images, videos, voice, text and sound – useful for marketing, sales and support functions. This data is often in unstructured or semistructured forms, so it poses a unique challenge for consumption and analysis.
- Publicly available data comes from a large number of open data sources, such as the US government’s data.gov, the CIA World Factbook or the European Union Open Data Portal.
- Other big data may come from data lakes, cloud data sources, suppliers and customers.
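For streaming sources, the idea of deciding what to keep, what to discard and what needs further analysis as data arrives can be sketched with a small generator. This is only an illustration: the `triage` function, the reading format, the valid range and the review threshold are hypothetical values chosen for the example.

```python
def triage(readings, low=0.0, high=100.0, review_at=90.0):
    """Yield usable readings one at a time, flagging those worth a closer look.

    Readings with a missing or out-of-range value are discarded. The range
    and the review threshold are illustrative assumptions.
    """
    for reading in readings:
        value = reading.get("value")
        if value is None or not (low <= value <= high):
            continue  # discard: nothing useful to store
        # Keep the reading and mark whether it needs further analysis.
        yield dict(reading, needs_review=value >= review_at)


# Hypothetical sensor stream; in practice this could be any iterable source.
stream = [
    {"sensor": "a", "value": 42.0},
    {"sensor": "b", "value": None},
    {"sensor": "c", "value": 95.5},
    {"sensor": "d", "value": 250.0},
]

kept = list(triage(stream))
for reading in kept:
    print(reading)
```

Because the function is a generator, each reading is handled as it arrives rather than after the whole stream has been collected.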
3) Access, manage and store big data
Modern computing systems provide the speed, power and flexibility needed to quickly access massive amounts and types of big data. Along with reliable access, companies also need methods for integrating the data, ensuring data quality, providing data governance and storage, and preparing the data for analytics. Some data may be stored on-premises in a traditional data warehouse – but there are also flexible, low-cost options for storing and handling big data via cloud solutions, data lakes and Hadoop.
4) Analyze big data
With high-performance technologies like grid computing or in-memory analytics, organizations can choose to use all their big data for analyses. Another approach is to determine upfront which data is relevant before analyzing it. Either way, big data analytics is how companies gain value and insights from data. Increasingly, big data feeds today’s advanced analytics endeavors such as artificial intelligence.
5) Make intelligent, data-driven decisions
Well-managed, trusted data leads to trusted analytics and trusted decisions. To stay competitive, businesses need to seize the full value of big data and operate in a data-driven way – making decisions based on the evidence presented by big data rather than gut instinct. The benefits of being data-driven are clear. Data-driven organizations perform better, are operationally more predictable and are more profitable.
Next Steps
Big data demands sophisticated data management and advanced analytics techniques. SAS has you covered.
SAS Data Preparation
To prepare fast-moving, ever-changing big data for analytics, you must first access, profile, cleanse and transform it. With a variety of big data sources, sizes and speeds, data preparation can consume huge amounts of time. SAS Data Preparation simplifies the task – so you can prepare data without coding, specialized skills or reliance on IT.
Recommended reading
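- Interview: The Present and Future of Analytics, Part 1: Discussing the Present of Analytics (1/4). We invited up-and-coming data scientist Daisuke Kochu (孝忠大輔) of NEC’s Big Data Strategy Division to join SAS consultants for a special three-way discussion.
- Interview: The Scoop on Hadoop. Cloudera co-founder Mike Olson discusses Hadoop trends, changes and the formula for success.
- Article: David vs. Goliath: Winning With Big Data, Armed With High-Performance Analytics. Many articles and advertisements about big data talk about its three V’s: velocity, variety and volume. But don’t forget that putting big data to use starts with a business plan. What really matters is how you draw out rapidly changing value and put it to work for the business.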
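Big Data in Today’s World
Big data, and the methods used to manage it properly and draw insight from it, are transforming the way the world uses business information. The resources below show the impact of big data in more concrete terms.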
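How midsize companies can use big data
Seven practical tips for drawing actionable results from data analysis and visualization and gaining a competitive advantage.
A big data playbook for business professionals
Aimed at business professionals who are broadly tech-savvy but not specialists, this white paper explains how Hadoop can be used and how it will affect enterprise data environments going forward.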
Big data and data mining
An introduction to a book on data mining by data mining expert Jared Dean. It discusses how to use high-performance computing and advanced analytics to get the most out of an analytics program.