Real-Time Analytics: Techniques to Analyze and Visualize Streaming Data
By Byron Ellis

Table of Contents

Introduction

Chapter 1 Introduction to Streaming Data
Sources of Streaming Data; Operational Monitoring; Web Analytics; Online Advertising; Social Media; Mobile Data and the Internet of Things; Why Streaming Data Is Different; Always On, Always Flowing; Loosely Structured; High-Cardinality Storage; Infrastructures and Algorithms; Conclusion

Part I Streaming Analytics Architecture

Chapter 2 Designing Real-Time Streaming Architectures
Real-Time Architecture Components; Collection; Data Flow; Processing; Storage; Delivery; Features of a Real-Time Architecture; High Availability; Low Latency; Horizontal Scalability; Languages for Real-Time Programming; Java; Scala and Clojure; JavaScript; The Go Language; A Real-Time Architecture Checklist; Collection; Data Flow; Processing; Storage; Delivery; Conclusion

Chapter 3 Service Configuration and Coordination
Motivation for Configuration and Coordination Systems; Maintaining Distributed State; Unreliable Network Connections; Clock Synchronization; Consensus in an Unreliable World; Apache ZooKeeper; The znode; Watches and Notifications; Maintaining Consistency; Creating a ZooKeeper Cluster; ZooKeeper's Native Java Client; The Curator Client; Curator Recipes; Conclusion

Chapter 4 Data-Flow Management in Streaming Analysis
Distributed Data Flows; At Least Once Delivery; The "n+1" Problem; Apache Kafka: High-Throughput Distributed Messaging; Design and Implementation; Configuring a Kafka Environment; Interacting with Kafka Brokers; Apache Flume: Distributed Log Collection; The Flume Agent; Configuring the Agent; The Flume Data Model; Channel Selectors; Flume Sources; Flume Sinks; Sink Processors; Flume Channels; Flume Interceptors; Integrating Custom Flume Components; Running Flume Agents; Conclusion

Chapter 5 Processing Streaming Data
Distributed Streaming Data Processing; Coordination; Partitions and Merges; Transactions; Processing Data with Storm; Components of a Storm Cluster; Configuring a Storm Cluster; Distributed Clusters; Local Clusters; Storm Topologies; Implementing Bolts; Implementing and Using Spouts; Distributed Remote Procedure Calls; Trident: The Storm DSL; Processing Data with Samza; Apache YARN; Getting Started with YARN and Samza; Integrating Samza into the Data Flow; Samza Jobs; Conclusion

Chapter 6 Storing Streaming Data
Consistent Hashing; "NoSQL" Storage Systems; Redis; MongoDB; Cassandra; Other Storage Technologies; Relational Databases; Distributed In-Memory Data Grids; Choosing a Technology; Key-Value Stores; Document Stores; Distributed Hash Table Stores; In-Memory Grids; Relational Databases; Warehousing; Hadoop as ETL and Warehouse; Lambda Architectures; Conclusion

Part II Analysis and Visualization

Chapter 7 Delivering Streaming Metrics
Streaming Web Applications; Working with Node; Managing a Node Project with NPM; Developing Node Web Applications; A Basic Streaming Dashboard; Adding Streaming to Web Applications; Visualizing Data; HTML5 Canvas and Inline SVG; Data-Driven Documents: D3.js; High-Level Tools; Mobile Streaming Applications; Conclusion

Chapter 8 Exact Aggregation and Delivery
Timed Counting and Summation; Counting in Bolts; Counting with Trident; Counting in Samza; Multi-Resolution Time-Series Aggregation; Quantization Framework; Stochastic Optimization; Delivering Time-Series Data; Strip Charts with D3.js; High-Speed Canvas Charts; Horizon Charts; Conclusion

Chapter 9 Statistical Approximation of Streaming Data
Numerical Libraries; Probabilities and Distributions; Expectation and Variance; Statistical Distributions; Discrete Distributions; Continuous Distributions; Joint Distributions; Working with Distributions; Inferring Parameters; The Delta Method; Distribution Inequalities; Random Number Generation; Generating Specific Distributions; Sampling Procedures; Sampling from a Fixed Population; Sampling from a Streaming Population; Biased Streaming Sampling; Conclusion

Chapter 10 Approximating Streaming Data with Sketching
Registers and Hash Functions; Registers; Hash Functions; Working with Sets; The Bloom Filter; The Algorithm; Choosing a Filter Size; Unions and Intersections; Cardinality Estimation; Interesting Variations; Distinct Value Sketches; The Min-Count Algorithm; The HyperLogLog Algorithm; The Count-Min Sketch; Point Queries; Count-Min Sketch Implementation; Top-K and "Heavy Hitters"; Range and Quantile Queries; Other Applications; Conclusion

Chapter 11 Beyond Aggregation
Models for Real-Time Data; Simple Time-Series Models; Linear Models; Logistic Regression; Neural Network Models; Forecasting with Models; Exponential Smoothing Methods; Regression Methods; Neural Network Methods; Monitoring; Outlier Detection; Change Detection; Real-Time Optimization; Conclusion

Index

About the Author

BYRON ELLIS is CTO of Spongecell, where he heads research and development. Previously the Chief Data Scientist for LivePerson and CTO at AdBrite, Ellis holds a Ph.D. in Statistics from Harvard University, and a B.S. in Cybernetics from UCLA. He presents sessions on real-time analytics at Strata and other major conferences.
