Showing posts with label Generalized. Show all posts
Showing posts with label Generalized. Show all posts

Tuesday, January 17, 2023

Generalized Architecture - SPA

Streaming Data System Architecture Components
• Collection
• Data Flow
• Processing
• Storage
• Delivery




Collection System 

• Mostly communication over TCP/IP network using HTTP
• Websites log data was the initial days use case
• W3C standard log data format was used
• Newer formats like JSON, AVRO, Thrift are available now
• Collection happens at specialized servers called edge servers
• Collection process is usually application specific
• New servers integrates directly with data flow systems
• Old servers may or may not integrate directly with data flow systems

Data Flow Tier

• Separation between collection tier and processing layer is required
• Rates at which these systems works are different
• What if one of system is not able to cope with another system?
• Required intermediate layer that takes responsibility of
• accepting messages / events from collection layer
• providing those messages / events to processing layer
• Real time interface to data layer for both producers and consumers of data
• Helps in guaranteeing the “at least once” semantics

Processing / Analytics Tier

• Based on “data locality” principle
• Move the software / code to a the location of data
• Rely on distributed processing of data
• Framework does the most of the heavy lifting of data partitioning, job scheduling, job managing
• Available Frameworks
• Apache Storm
• Apache Spark (Streaming)
• Apache Kafka Streaming etc

Storage Tier

In memory or permanent
• Usually in memory as data is processed once
• But can have use cases where events / outcomes needs to be persisted as well
• NoSQL databases becoming popular choice for permanent storage
• MongoDB
• Cassandra
• But usage varies as per the use case, still no database that fits all use cases

Delivery Layer  

• Usually web based interface
    • Now a days mobile interfaces are becoming quite popular
    • Dashboards are built with streaming visualizations that gets continuously updated as
underlying events are processed
    • HTML + CSS + Java script + Websockets can be used to create interfaces and update
them
    • HTML5 elements can be used to render interfaces
    • SVG, PDF formats used to render the outcomes
• Monitoring / Alerting Use cases
• Feeding data to downstream applications






---------------------------------------------------------------------------- 
All the messages below are just forwarded messages if some one feels hurt about it please add your comments we will remove the post.Host/author is not responsible for these posts.