The tutorial has ten tracks.
- Track 1: Overview and set-up for the entire tutorial. This track guides you through creating several cloud objects that all other tutorial tracks rely on.
- Track 2: Create a basic Kafka system and a KCE. This track focuses on the centerpiece of your data lake: the Kafka system. It also shows you how to use the KCE to manage the Kafka system. Note that the Kafka system created in this track does not have the Kafka-Connect component.
- Track 3: Deploy Kafka-Connect and a reader. This track adds a Kafka-Connect component to the Kafka system created in Track 2. We then create a reader and deploy it to the Kafka-Connect component.
- Track 4: Create a simple pipeline. The pipeline created in this track parses the dirty data in the input topic. It also demonstrates how to use parsers.
- Track 5: Create an intermediate pipeline. The pipeline created in this track performs deduplication and lookups. It also demonstrates how to call a lookup.
- Track 6: Create an advanced pipeline. The pipeline created in this track performs aggregation, which is more complicated than the processing in the previous tracks.
- Track 7: Create a writer. You should treat data within a Kafka system as transient: eventually, it will be aged out. This track builds a writer that streams data into permanent storage.
- Track 8: Emergency preparedness. This track outlines what to do if you experience malfunctions in the Kafka system. In general, a Kafka system is very resilient to errors. But you may still run into hard limits, such as quota or disk capacity, or into network failures. You cannot run your real-time system without first planning for emergencies.
- Track 9: Create a microservice under a PCA. This track shows you how to quickly deploy your API service and get SSL support from a PCA.
- Track 10: Create a microservice for a Kafka system. This track is similar to Track 9, but here your microservice needs to access a Kafka system. We will set up an environment with all parameter files and SSL prepared for your app.
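As a rough, tool-agnostic illustration of the kind of processing that Tracks 4 through 6 build up to, the sketch below chains parsing, deduplication, a lookup, and an aggregation over a few in-memory records. It is plain Python, not Calabash code: the record layout, the `COUNTRY_BY_CODE` table, and every function name are hypothetical and chosen only for illustration. The tracks themselves show how these steps are actually expressed in Calabash.

```python
# Conceptual sketch only; none of these names come from Calabash.

# Hypothetical lookup table mapping a country code to a country name.
COUNTRY_BY_CODE = {"US": "United States", "DE": "Germany"}


def parse(raw):
    """Parse an 'id,code,amount' line; return None if the line is dirty."""
    parts = [p.strip() for p in raw.split(",")]
    if len(parts) != 3:
        return None
    rec_id, code, amount = parts
    try:
        return {"id": rec_id, "code": code.upper(), "amount": float(amount)}
    except ValueError:
        return None


def dedup(records):
    """Yield only the first record seen for each id."""
    seen = set()
    for rec in records:
        if rec["id"] not in seen:
            seen.add(rec["id"])
            yield rec


def lookup(records):
    """Enrich each record with a country name from the lookup table."""
    for rec in records:
        yield {**rec, "country": COUNTRY_BY_CODE.get(rec["code"], "unknown")}


def aggregate(records):
    """Sum the amounts per country."""
    totals = {}
    for rec in records:
        totals[rec["country"]] = totals.get(rec["country"], 0.0) + rec["amount"]
    return totals


def run_pipeline(raw_lines):
    """Chain the stages: parse, drop dirty lines, dedup, lookup, aggregate."""
    parsed = (parse(line) for line in raw_lines)
    clean = (rec for rec in parsed if rec is not None)
    return aggregate(lookup(dedup(clean)))


# Made-up input: one duplicate, one malformed line, one bad amount.
RAW = ["1, us, 10.5", "2,DE,4", "1, us, 10.5", "garbage", "3,FR,abc", "4,fr,1"]

if __name__ == "__main__":
    print(run_pipeline(RAW))
```

In a real-time system, each stage would read from and write to Kafka topics rather than Python generators, but the order of the stages is the same idea.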
The tracks depend only lightly on one another, so you can jump between them. Within each track, however, the topics are tightly dependent and need to be read sequentially.
We try to keep each topic clear and short. The instructions focus on the action part of your work, i.e., the hands-on skills for accomplishing the goals. Background information is kept brief. Readers are encouraged to read the user manuals for more detailed explanations.
The best way to follow the tutorial is hands-on. Of course, you will incur some cloud resource costs in the process. We provide estimates of the cloud resource cost at various places in the tutorial. Also, you can use Calabash CLI to undeploy all resources in just a few minutes. This feature is handy when you decide to take a break: you can quickly restore the objects later, again using Calabash CLI.
We are confident that once you have gone through these tracks, you will find that Calabash is extremely easy to use and that, with it, you can quickly become a real-time expert. At the same time, you will also save on cloud expenses.
Finally, Calabash is not a canned solution. It is a development platform that offers you ample freedom for your creativity. With it, you can implement any processing logic your business needs.