AWS IoT Analytics – Working with Data and Analytics
AWS IoT Analytics is a managed service for collecting, processing, and analyzing data from IoT devices. It lets you process and analyze large datasets from IoT devices without the need for complex infrastructure or custom processing code. You can apply mathematical and statistical models to your data to make sense of it and make better decisions accordingly, and you can integrate it with many other AWS services, such as Amazon S3 or Amazon QuickSight, for further analysis and visualization.
The following components of IoT Analytics are crucial to know, as we will use them as we go through the exercise in the next subsection:
Channel: A channel collects data from a selected Message Queuing Telemetry Transport (MQTT) topic and archives the raw, unprocessed messages before the data is published to a pipeline. Alternatively, you can send messages to a channel directly through the BatchPutMessage API (a sketch of this follows the list). Unprocessed messages are stored in an S3 bucket managed either by you or by AWS IoT Analytics.
Pipeline: A pipeline consumes messages from a channel and lets you process them before storing them in a data store. Pipeline activities perform the necessary transformations on your messages, such as renaming or adding message attributes, or filtering messages based on attribute values (see the second sketch after the list).
Data store: A pipeline stores its processed messages in a data store, which is a queryable repository of messages. It is important to distinguish this from a database: a data store is not a database, but more like a temporary repository. You can provision multiple data stores for messages that come from different devices or locations, or route messages into them based on their attributes, depending on how you configure your pipeline and its requirements. The data store's processed messages are also kept in an S3 bucket managed either by you or by AWS IoT Analytics.
Dataset: Data retrieved from a data store becomes a dataset. IoT Analytics lets you create a SQL dataset or a container dataset. You can explore the insights in your dataset further through integration with Amazon QuickSight or Jupyter Notebook. Jupyter Notebook is an open source web application that allows you to create and share documents containing live code, equations, visualizations, and narrative text, and is often used for data cleaning and transformation, numerical simulation, statistical modeling, data visualization, and ML. You can also send the contents of a dataset to an S3 bucket, enabling integration with existing data lakes or in-house applications you may have for further analysis and visualization, or to AWS IoT Events to trigger certain actions when there are failures or changes in operation.
SQL dataset: A SQL dataset is similar to a materialized view in a SQL database. You create a SQL dataset by applying a SQL action, that is, a SQL query that runs against a data store.
Trigger: A trigger is something you specify to create a dataset automatically. It can be a time interval (a schedule) or the creation of another dataset's content (see the third sketch after the list).
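To make these components concrete, here is a minimal sketch, using the boto3 iotanalytics client, of creating a channel and sending messages to it with the BatchPutMessage API. The channel name, message IDs, and payload fields are assumptions chosen for illustration, not values from the exercise.

import json

import boto3

# Assumed names and values, for illustration only
CHANNEL_NAME = "demo_channel"

iota = boto3.client("iotanalytics")

# Create a channel; its raw messages are archived in an S3 bucket
# managed by AWS IoT Analytics unless you configure your own storage.
iota.create_channel(channelName=CHANNEL_NAME)

# Send messages directly to the channel with the BatchPutMessage API,
# instead of routing them from an MQTT topic through an IoT rule.
messages = [
    {
        "messageId": str(i),
        "payload": json.dumps({"deviceId": f"sensor-{i}", "temperature": 20 + i}).encode(),
    }
    for i in range(3)
]
iota.batch_put_message(channelName=CHANNEL_NAME, messages=messages)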
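The second sketch, under the same assumptions, provisions a data store and a pipeline that reads from the channel above, filters messages by an attribute value, copies an attribute to a new name, and writes the result to the data store. The activity names and the filter expression are hypothetical.

DATASTORE_NAME = "demo_datastore"

# Create a data store to hold the processed messages.
iota.create_datastore(datastoreName=DATASTORE_NAME)

# Create a pipeline: read from the channel, drop messages whose
# temperature attribute is too low, copy the attribute to a new name,
# and store what remains in the data store.
iota.create_pipeline(
    pipelineName="demo_pipeline",
    pipelineActivities=[
        {"channel": {"name": "read_channel", "channelName": CHANNEL_NAME, "next": "drop_cold"}},
        {"filter": {"name": "drop_cold", "filter": "temperature > 20", "next": "copy_temp"}},
        {"addAttributes": {"name": "copy_temp", "attributes": {"temperature": "temp_c"}, "next": "store"}},
        {"datastore": {"name": "store", "datastoreName": DATASTORE_NAME}},
    ],
)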
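Finally, a third sketch of a SQL dataset created from that data store, with a schedule trigger so its contents are regenerated every hour. The dataset name, query, and cron expression are placeholders.

DATASET_NAME = "demo_dataset"

# Create a SQL dataset; the query result is materialized on a schedule
# (hourly here), much like refreshing a materialized view.
iota.create_dataset(
    datasetName=DATASET_NAME,
    actions=[
        {
            "actionName": "hourly_query",
            "queryAction": {"sqlQuery": f"SELECT * FROM {DATASTORE_NAME} WHERE temp_c > 21"},
        }
    ],
    triggers=[{"schedule": {"expression": "cron(0 * * * ? *)"}}],
)

# The contents can also be generated on demand.
iota.create_dataset_content(datasetName=DATASET_NAME)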
With an understanding of these components, we can look at other services that we will also come across in our practical exercises.