This article is the user manual for Calabash CLI.
Table of Contents:
- Start Calabash CLI
- Connect to the Calabash Repository
- Set Data System Context
- Get Info about Deployable Objects
- Deploy/Undeploy an Object
- Set up Local TLS (SSL) Client
- Set up Kafka Client Environment (KCE)
- Get PCA Access Token
- Get Access Token of API Service Reader
- Generate Pipeline Code
- Export/Import Metadata
- Get Help
1. Start Calabash CLI
To start Calabash CLI, first launch a command window, then cd to the Calabash CLI top directory. In this directory, you should see these two subdirectories:
bin libs
To start Calabash CLI on Mac and Linux:
bin/calabash.sh
To start Calabash CLI on Windows:
bin\calabash.bat
You will see a banner and a Calabash CLI prompt.
Calabash CLI, version 3.0.0
Data Canals, 2022. All rights reserved.
Calabash >
You are now in the Calabash CLI interactive shell.
2. Connect to the Calabash Repository
You must connect to your account in Calabash Repository, where you can find the design of your system objects. The format of the command is
connect LOGIN_NAME
where LOGIN_NAME is your login to Calabash GUI. You will then type your password, which is not displayed.
Calabash > connect jdoe
Password:
Connected.
Calabash (jdoe)>
The prompt now shows the connected user.
3. Set Data System Context
You must select a data system as your context. A data system owns all other objects. The format of setting data system context is
use ds DATA_SYSTEM_NAME
For example:
Calabash (jdoe)> use ds lake1
Current ds set to lake1
Calabash (jdoe:lake1)>
The prompt now has a second part showing the data system context.
To see available data systems, use the command
list ds
For example
Calabash (jdoe)> list ds
DS Name              Platform Description
-------------------- -------- ---------------------------------------------------
lake1                GCP      This is a data lake system for demo purpose.
You can inspect the details of a data system. This is done by the desc command
desc ds DATA_SYSTEM_NAME
For example
Calabash (jdoe)> desc ds lake1
name: lake1
platform: GCP
desc: This is a data lake system for demo purpose.
num of iobjs: 7
num of parsers: 3
num of lookups: 8
num of readers: 9
num of pipelines: 3
num of writers: 2
The details show the numbers of all types of objects in this data system.
4. Get Info about Deployable Objects
You can issue commands to obtain metadata about various objects. The following table shows which types of objects you can query and their abbreviations. You will need to use these abbreviations in the commands.
Object Type | Abbreviation |
---|---|
Data System | ds |
Infrastructure Object | i |
Reader | r |
Pipeline | p |
Writer | w |
Note that parsers and lookups are not in this list because they are functions, not deployable objects. The goal of Calabash CLI is to deploy objects to the cloud, so it only provides info about deployable objects. Use the Calabash GUI to get info about parsers and lookups.
To show all available objects of a certain type, use this command
list OBJ_TYPE
where OBJ_TYPE is the abbreviated object type as shown in the above table.
For example,
Calabash (jdoe:lake1)> list i
Obj Name             Project         Description                              Type                           Status
-------------------- --------------- ---------------------------------------- ------------------------------ ----------------------------------------
lake1-pca            dlb-internal    This is the PCA for data system lake1    Private Certificate Authority  Deployed to 35.202.118.20:8081 at 4/20/2021, 4:51:53 PM >>> Created at 4/14/2021, 2:38:34 PM
vm1                  dlb-internal                                             Virtual Machine                Undeployed at 4/14/2021, 8:20:52 PM >>> Created at 4/14/2021, 4:51:53 PM
msvm1                dlb-internal                                             Microservice on VM             >>> Created at 4/14/2021, 5:08:34 PM
cluster1             dlb-internal                                             Kubernetes Cluster             >>> Created at 4/14/2021, 10:48:02 PM
msk8s1               dlb-internal                                             Microservice on Kubernetes     >>> Created at 4/15/2021, 12:05:09 AM
cluster2             dlb-internal                                             Kubernetes Cluster             >>> Created at 4/15/2021, 12:07:20 AM
kafka1               dlb-internal                                             Messaging Storage (Kafka)      >>> Created at 4/15/2021, 6:34:21 PM
In the listing, the status column provides both deployment status and change status.
To get detailed information about an object, use this command
desc OBJ_TYPE OBJ_NAME
where OBJ_TYPE is the type abbreviation and OBJ_NAME is the name of the object to describe.
For example,
Calabash (jdoe:lake1)> desc i lake1-pca
name: lake1-pca
proj: dlb-internal
desc: This is the PCA for data system lake1
type: Private Certificate Authority
attrs: {region=us-central1, zone=us-central1-a, network=lake1, subnet=subnet-us-central1, mt=n1-standard-1 [cpus: 1, mem-gb: 3.84], useExistingBootDisk=false, existingBootDisk=, id2=u8H7A4RVD+ytEBdgP8uItA==, id1=AAAADHN/MPZTKRBCaqluQo4D/pTt3hXtv7oIRqYmICRACQQaZxJEhmwy1CJj7SecPKn/2kJzBSrcmvo+TdPeBoR7xrw=}
deployedInfo: null
updateInfo: null
status: Deployed to 35.202.118.20:8081 at 4/20/2021, 4:51:53 PM >>> Created at 4/14/2021, 2:38:34 PM
The description shows all the metadata of this object.
5. Deploy/Undeploy an Object
Use the following command to deploy an object to the cloud
deploy OBJ_TYPE OBJ_NAME
For example
Calabash (jdoe:lake1)> deploy i lake1-pca
Deployed to 35.202.118.20:8081
Calabash (jdoe:lake1)>
The deploy command will respond with a “Deployed to …” message if all is fine, or an error message like “Failed ….”
Similarly, the undeploy command has this format:
undeploy OBJ_TYPE OBJ_NAME
For example
Calabash (jdoe:lake1)> undeploy i lake1-pca
Undeployed
Calabash (jdoe:lake1)>
“Undeployed” is a good response. An error message starts with “Failed ….”
6. Set up Local TLS (SSL) Client
You can use Calabash CLI to create an SSL key and get a PCA-signed certificate. Using the SSL key and cert, you can communicate with services in your data system using TLS (SSL) from your local machine.
There is a preparatory step before setting up local TLS (SSL): you must set up an SSH tunnel to the VM where the PCA is running.
In Google Cloud SDK, issue this command to set up an SSH tunnel:
gcloud compute ssh PCA_VM_NAME --zone VM_ZONE --ssh-flag "-L 8081:localhost:8081"
where PCA_VM_NAME is the VM name of the PCA, and VM_ZONE is the zone of this VM. The PCA is an internal service. Your local machine (outside the cloud) cannot access the PCA without the secure tunnel.
The tunneling command will create a secure shell to the VM. Leave it running to keep the SSH tunnel open.
After opening the SSH tunnel, you can issue the Calabash CLI command for creating an SSL client environment. The command has this format:
set-up-ssl PCA_NAME
where PCA_NAME is the name of a deployed PCA. The command will fail if the PCA does not exist in the Calabash repository or is not deployed.
An example:
DLB (jdoe:ds1)> set-up-ssl lake1-pca
Is the PCA in the cloud? [y]:
Please enter a dir for the security files: /Users/jdoe/my_ssl_files
Deleted file jdoe-key.pem
Deleted file jdoe.jks
Deleted file jdoe-cert.pem
Deleted file trust.jks
Deleted file rtca-cert.pem
Deleted dir /Users/jdoe/my_ssl_files
mkdirs /Users/jdoe/my_ssl_files
SSL certificate supported by PCA lf-pca for user jdoe is created in /Users/jdoe/my_ssl_files
This command first asks if the PCA has been deployed to the cloud. Calabash also supports PCAs running on-premise. In the above example, we hit the RETURN key to take the default, which means “PCA is in the cloud.”
The command then asks for a directory for generating the TLS (SSL) files. If the directory already exists, e.g., the “/Users/jdoe/my_ssl_files” in the above example, it will delete and recreate it. Finally, the command generates TLS (SSL) files into the empty directory.
You can find the following files in the “/Users/jdoe/my_ssl_files” directory.
% cd /Users/jdoe/my_ssl_files
% ls
jdoe-cert.pem   jdoe-key.pem    jdoe.jks        rtca-cert.pem   trust.jks
%
The following table shows the purpose of each file.
File Name | Content |
---|---|
jdoe-key.pem | The private SSL key for jdoe in PEM format. |
jdoe-cert.pem | The SSL certificate for jdoe in PEM format. |
rtca-cert.pem | The SSL certificate for the PCA. The “rtca” stands for runtime certificate authority. This file is commonly called “cacert.” |
jdoe.jks | The Java keystore in JKS format. |
trust.jks | The Java truststore in JKS format. |
The first three files in the above table are the most important, i.e., the private key, its SSL certificate, and the cacert. The key and trust stores are created for the convenience of Java applications.
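As an illustration, the three PEM files can be wired into a TLS client such as curl. This is a sketch only: the service host and path below are placeholders, not real Calabash endpoints, and the file names assume the "jdoe" example above.

```shell
# Build the TLS options for curl from the generated files.
# SERVICE_HOST:PORT/path is a placeholder; substitute a TLS-enabled
# service in your data system.
SSL_DIR=/Users/jdoe/my_ssl_files
CURL_TLS_OPTS="--key $SSL_DIR/jdoe-key.pem --cert $SSL_DIR/jdoe-cert.pem --cacert $SSL_DIR/rtca-cert.pem"

# Print the full command instead of running it, since the endpoint
# is hypothetical:
echo "curl $CURL_TLS_OPTS https://SERVICE_HOST:PORT/path"
```

The `--cacert` option makes the client trust certificates signed by the PCA, while `--key` and `--cert` present your client identity for mutual TLS.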
7. Set up Local Kafka Client Environment (KCE)
A Kafka Client Environment (KCE) contains scripts for managing a Kafka system deployed by Calabash. You can install a KCE on your local computer. Read the KCE User Manual for the concept of KCE and commands available in a KCE.
There is a preparatory step before creating a KCE. You must set up an SSH tunnel to the VM where the PCA is deployed. This tunnel is needed for setting up TLS (SSL) for the KCE.
In Google Cloud SDK, issue this command to set up an SSH tunnel:
gcloud compute ssh PCA_VM_NAME --zone VM_ZONE --ssh-flag "-L 8081:localhost:8081"
where PCA_VM_NAME is the VM name of the PCA, and VM_ZONE is the zone of this VM. To find a list of VMs with their zones, you can use this command:
gcloud compute instances list
The gcloud command for creating the SSH tunnel will start a secure shell. As long as it runs, the SSH tunnel is open.
After opening the SSH tunnel, you can issue the Calabash CLI command for creating a KCE on the local machine. The command has this format:
set-up-kce KAFKA_NAME
where KAFKA_NAME is the name of a Kafka system that has been deployed. This command will fail if the Kafka system does not exist in the Calabash Repository or has not been deployed.
The “set-up-kce” command will ask you a series of questions to customize the KCE. Here is an example interaction; the text after each prompt is what you type:
Calabash (jdoe:lake1)> set-up-kce kafka1
Is the PCA in the cloud? [y]:
Enter the directory where Kafka is installed: /Users/jdoe/kafka
Enter producer user names: jeff
Enter consumer user names: april
Enter Kafka client top dir: /Users/jdoe/kce
Enter the optional kafkacat path: /usr/local/bin/kafkacat
Enter the optional jq path: /usr/local/bin/jq
Enter external ip of this machine: 192.168.0.17
Created /Users/jdoe/kce/properties/admin_jdoe.properties
Created /Users/jdoe/kce/properties/producer_jeff.properties
Created /Users/jdoe/kce/properties/consumer_april.properties
KCE is successfully installed in /Users/jdoe/kce
Calabash (jdoe:lake1)>
This command first asks if the PCA defined for the Kafka system is in the cloud. Calabash also supports PCAs running on-premise. In the above example, we hit the RETURN key to take the default, which means “PCA is in the cloud.”
Client access to the Kafka system is controlled by roles. There are admin, producer, and consumer roles. The superusers of a Kafka system have the admin role. Optionally, you can ask the KCE to support producers and consumers of your creation.
For example, in the above interaction, we ask the command to create support files for a producer named “jeff” and a consumer named “april.” The producer and consumer names may be arbitrary identifiers. They can also be the superusers of the Kafka system (such as “jdoe” in our example). Although the superusers can manage the entire system, they do not have producer/consumer authorization on Kafka topics.
The KCE will be installed in “/Users/jdoe/kce.”
There are five prerequisites for a local KCE to work.
- Prerequisite 1: You must have Kafka software installed on the local machine. In the above example, Kafka is in the “/Users/jdoe/kafka” directory. Calabash supports Kafka version 2.8 and up. (If you do not have Kafka on your local machine, it is easy to set up: download the Kafka binary distribution from the Apache Kafka website, unpack it in any directory, and that is all.)
- Prerequisite 2: the Kafka object must exist in the Calabash repository. In the above example, the name of the Kafka system is “kafka1.”
- Prerequisite 3: the Kafka system must use public brokers. These brokers are visible on the internet. Otherwise, you will not be able to reach them.
- Prerequisite 4: kafkacat must be installed. Without kafkacat, the scripts “list_reader_offset.sh” and “set_reader_offset.sh” will not work. Refer to the kafkacat documentation for more information.
- Prerequisite 5: the “jq” utility must be installed. It is used to “pretty print” command outputs. Refer to the jq documentation for more information.
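The local-machine prerequisites (Kafka installed, kafkacat, and jq) can be sanity-checked with a small script before running “set-up-kce”. This is a sketch; the Kafka path is the example path from this manual, so point KAFKA_DIR at your own installation.

```shell
# Check KCE prerequisites 1, 4, and 5 on the local machine.
KAFKA_DIR=/Users/jdoe/kafka   # example path; adjust to your Kafka install
MISSING=""
[ -d "$KAFKA_DIR/bin" ]             || MISSING="$MISSING kafka"
command -v kafkacat >/dev/null 2>&1 || MISSING="$MISSING kafkacat"
command -v jq >/dev/null 2>&1       || MISSING="$MISSING jq"

if [ -z "$MISSING" ]; then
  echo "All local KCE prerequisites found."
else
  echo "Missing:$MISSING"
fi
```

The remaining prerequisites (the Kafka object in the repository, public brokers) can only be verified through Calabash itself.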
Important. The connected Calabash CLI user must be one of the superusers in the Kafka system. For example, “jdoe” in the above example is a superuser in the Kafka system. The “set-up-kce” command will fail if the connected user is not a superuser of the Kafka system.
8. Get PCA Access Token
The PCA is no different from any other X509 certificate authority. If you know how to interact with one, you may write your own code to interact with it. However, a PCA is protected by an access token: to request an SSL key signing, you need the PCA's access token.
If you own the PCA and the PCA is currently deployed, you can retrieve this access token. The format of the command is
at pca PCA_NAME
For example:
Calabash (jdoe:lake1)> at pca testpca
Is the PCA in the cloud? [y]:
eyJhbGciOiJIUzUxMiJ9.eyJzdWIiOiJ7XCJpZFwiOlwiMFwiLCBcImlkVHlwZVwiOlwiUENBLXRlc3RwY2FcIn0iLCJleHAiOjE2MDk3NTcxNjh9.4KcUplkzdL4qvwjXJHCITsMPL9j_vp4GV0-GzNlwnbl3-ZVkZNwZG2hipLdpTUX_HgO6XHBJSDuuiepUBbqGzw
Calabash (jdoe:lake1)>
The returned access token is a JWT token and will expire in 10 days. Use token type “bearer” or “jwt” when adding it to the “authorization” property in the HTTP header.
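For illustration, here is how such a token might be attached to an HTTP request. The token value and the request path below are placeholders; substitute the actual token printed by “at pca”.

```shell
# Placeholder token; substitute the JWT printed by "at pca".
PCA_TOKEN="eyJhbGciOiJIUzUxMiJ9.example-token"

# Token type "bearer" in the "authorization" property of the HTTP header:
AUTH_HEADER="authorization: Bearer $PCA_TOKEN"

# A request through the SSH tunnel would then look like (not run here):
echo "curl -H \"$AUTH_HEADER\" https://localhost:8081/..."
```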
Security warning: Treat this access token as a top secret of your entire data system. If it is compromised, you must recreate the PCA.
9. Get Access Token of API Service Reader
If you own a deployed API service reader, you can retrieve its access token. The format of the command is
at r READER_NAME
For example:
Calabash (jdoe:lake1)> at r apiservicevm
Is the service in the cloud? [y]:
eyJhbGciOiJIUzUxMiJ9.eyJzdWIiOiJ7XCJpZFwiOlwiMFwiLCBcImlkVHlwZVwiOlwiQXBpc2VydmljZS1hcGlzZXJ2aWNldm1cIn0iLCJleHAiOjE2MTAxNTEwMTd9.4Pm4MrnEoZwD2ShopS7F6iL0368-aVpu-TZHqDPPilqIix4zxJnGfXQ8RD4HEEBRDCWGtPCxkqQJW6Mq3JYAIQ
Calabash (jdoe:lake1)>
The returned access token is a JWT token. It will expire in ten days. Use either “bearer” or “jwt” as the token type when setting it in the “authorization” property of an HTTP header. You can reset the access token. The command format:
reset-at READER_NAME
If the reset is successful, the previously issued access token becomes invalid. You will need this command if the access token is compromised or is nearing its expiry.
10. Generate Pipeline Code
Calabash generates Kafka Streams code in Java for a data pipeline. The code generation happens during the pipeline deployment. If the generated code cannot compile, the deployment will fail.
You may wish to test the code generation before deploying your pipeline. It ensures the deployment will be a smooth process. The “code-gen” command is for that purpose. You will be able to manually generate the code, compile it, and fix any errors in the design of the pipeline.
The format of the code generation command is
code-gen PIPELINE_NAME
Example:
Calabash (jdoe:lake1)> code-gen payment-p1
Enter a directory for the generated code: /tmp/payment-p1
mkdirs /tmp/payment-p1
mkdirs /tmp/payment-p1/generated/build/libs
Codegen is successful!
As shown in the above example, the “code-gen” command asks you for a directory to generate the code in.
If the command is not successful, error messages will be printed. Digest them, fix your pipeline design, then come back and rerun “code-gen.”
Finally, you can examine and manually compile the generated code. See the following example.
% cd /tmp/payment-p1/generated
% ls -F
README.md       com/            gradlew.bat     src/
build/          gradle/         out/
build.gradle    gradlew*        settings.gradle
% find src
src
src/main
src/main/resources
src/main/java
src/main/java/com
src/main/java/com/dcs
src/main/java/com/dcs/kafka
src/main/java/com/dcs/kafka/pipeline
src/main/java/com/dcs/kafka/pipeline/generated
src/main/java/com/dcs/kafka/pipeline/generated/Parser.java
src/main/java/com/dcs/kafka/pipeline/generated/CustomFunc.java
src/main/java/com/dcs/kafka/pipeline/generated/Lookup.java
% sh gradlew build

BUILD SUCCESSFUL in 4s
3 actionable tasks: 3 executed
You do not need to install anything to compile the code.
11. Export/Import Metadata
The export command has this format:
export-ds EXPORT_SPEC_FILE
This command requires an “export specification file.” It is a properties file specifying which objects you want to export. The following is an example of the export spec:
iobjs = *
readers = *
parsers = *
lookups = *
pipelines = *
writers = *
export-file = /tmp/lake_finance_metadata.properties
In this example, the wildcard * means “all the objects.” If you want to export objects selectively, list their names, separated by commas. You must define the “export-file” property, which is the path of the export result. The exported metadata will also be in the properties format, hence the “.properties” suffix for the result file. But that is not essential; you may name the output file however you like. The following is an example run of the export command. (The above export spec is in “/Users/jdoe/export_spec.properties.”)
Calabash (jdoe:lake_finance)> export-ds /Users/jdoe/export_spec.properties
Export target file /tmp/lake_finance_metadata.properties exists. Overwrite this file? [y/n] y
Exported 12 iobjs.
Exported 10 readers.
Exported 1 parsers.
Exported 5 lookups.
Exported 3 pipelines.
Exported 7 writers.
Calabash (jdoe:lake_finance)>
The export command asks if you want to overwrite the existing export result file. It also shows how many objects it has exported. The export utility encrypts every object. The following is a portion of the export result file:
exported-from = lake_finance
salt = 03vugKL8KbFOah2DzC6iag==
#
# iobjs
#
iobj.fnet = AAAADEQ9MN1T7XsDi1WArRHRF180aGAZESvk09z/PJuaRHZuADpZ81bdOpeYQhcm44W3HMoAlmEuhLpT3rxZhof/u6qGNCkez4otrh/womBiVLVdg/YgOZgg4wMvFjGH/yI2GPIjgwc3RUS6gS7jJXf9KnPdNnTsVQ4Gc2iYfwD4+/HNamX9pmDJFbLo6tNLpWaJkSCY7GOJ/Hcv1ZgUywJZce+mRIQdD/fllUKyKNms/vdns/ApaBwmd3/wMV0qQ24xPMgwrT6W3TZK7iyS8bkfGYSiIthMvBDVweU6PL3N4zV8GjCjl7WGvftmuG8gkzLemq8kzyYE3VKPjgGsZ+KXBKP5PrjLFuOU7txe2qwMZb1DQ7Li6TwJDdMVFTKITvnEGhPzh/dEDCNr0gm7uvv6io39y3ArONRC7JAiopsJK6E6nNt7ZwAgjn7WnVLCYBfloZc+FhYCIaC5NRJ6bs/ODGdOB5cUlT4rPaPifGxG70WoWl5aBWiRA+0vQXgD/jiWAi5lFBKd6WZYaDDbYop4TV5OREMt+w8eBJ7I5UQL24skCOOJUYXvXcOIWWtYxzBT5tCoemLvReQNenTlaosSaoz6zFwGvsHQXPs25M/pgo07mQl3Gc4cBuiwmLysCnzRmN+bqy4KKgbIfPBM18rfL9ixLkkQw4MLNKjhNQ9jGtqQPcxnRDtWO7CuGdjXzWNkAJj6DpskWqdyzn00NgJpLNX/NWxpFzt+P2UY7ZCeUtPYB9vNqJNFHEMkZebh2dqw3Kk5t4LQrheIkYqoKeZphAP4m976i7p0i4pHIOZkyImt3xmAg8xfxePwT/CxU+/eY4FjZc+5BwN4L7MsuintGX/X01x3Nb6ybrRsxTNnekXl+OzhyIdAs48yedz/urLLoAynhqDN2uaAwaOpjX2yKx9TQ/llYXY7AlrGTKu0kRK/9BJwNaxRRrO2bLXwsfqAxoy1rt1Pti3PGJyQ6wN0UBO7cjAeWAqodnFkQ+h9t7NxHBdNKyV1ERFJvT7ig5yOGMrGrEuz6ElvIpXXpX6bs1ecnvfzQ5eqb0T/Uh8V0u9h9zZ7SBoQr69B8zuHo1z3C+Xei1Br0ZRJ+LGri525mEQUd3zLMmMljW8MCsiywkXHLAeCrX5l4qDaf0je1BXRIDBmmqu49k4xxDmwdAjq0CcJXzqk/ZziW7GUURhrhRJOu2DP8DBmlf4gw7234DtZetqctgB01nzNGHKvm/7YzofEUOEiQvgFGD2TWF9RMsZkg2T9uBeCmpsqZAb1VxDdbW+Um/z+ZKZPCRUZqVkZoJdCSjh+ZlQDFk3XuXNg8Aofq4lgjGO4JgKMdO298wXoVwsg9j9EcM0xLVXcgJKjqQ==
iobj.lf-pca = AAAADF77x8u7MlNlKrj4KzbyanT3fibGqaifmibhz7C6QyDxi7FfxhRgCarotV0fuizSvgWsQgBVn5g9KykB3xM/+4klSGggKnbiyPfd/fdKFAOW5r+m4ekN18Jafq8Itu+uEoE+RECqoy0waPfGL06vWsh193VEQ8UvIW3JSzudIY8DR3++uwV7axBINK1h3CD68weCUvGRNVVoStZL2ljSUJJtD/+8A/ZDN1jO1ECXoMwUWs9tBcEyxAAmIro48TW26wEswectvpglpyfd5k7w+o72DaJiEFWBIiCK7gPt3BkYp1wz/wvWxLK/CDCggUBJaAfE6MGkOpdeQ4zAk0l3ThQAVivyU7pTil6f2mAYDeRXacj8Hq+/TzJKvjVyRFi+1D906a0o46hnNmu42FFHfLl1qSgJ0QRrUOXDEc4wRI99IABXyitJ5kzxiFN484nJKiwhN7FBglluKmf5
...
Only the “import-ds” command can decrypt the scrambled text at the time of import. The “import-ds” command has the following format:
import-ds EXPORTED_METADATA_FILE
If an object to be imported already exists in the repository, the import utility will print an error message and skip that object. The following is an example run of the “import-ds” command.
Calabash (jdoe:import-test)> import-ds /tmp/lake_finance_metadata.properties
Importing iobj named hw-1 is successful. New id 615227a1755bb28db5ea741e
Importing reader named eventlogger is successful. New id 615227a1755bb28db5ea7427
Importing iobj named hw-2 is successful. New id 615227a1755bb28db5ea742d
Importing iobj named fnet is successful. New id 615227a1755bb28db5ea7433
Importing iobj named lf-kafka is successful. New id 615227a1755bb28db5ea7439
Importing lookup named currency-conversion-rate-apicall is successful. New id 615227a2755bb28db5ea743f
Importing iobj named lf-pca is successful. New id 615227a2755bb28db5ea7445
Importing pipeline named payment-p3 is successful. New id 615227a2755bb28db5ea744e
Importing writer named apitarget-writer-1 is successful. New id 615227a2755bb28db5ea7457
...
In this example, we created a new data system named “import-test” and imported all objects into it.
Warning. Since the import utility creates objects in the repository, it is subject to quota control. If you have run out of quota, a message similar to the following will show up:
Failed importing iobj named vm1: quota limit has reached.
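As a variation on the wildcard export spec shown earlier, you can export objects selectively by listing their names. The names below are taken from this section's import example and are illustrative only.

```properties
# Export only two infrastructure objects and one reader;
# leave the other object types empty.
iobjs = lf-pca, lf-kafka
readers = eventlogger
parsers =
lookups =
pipelines =
writers =
export-file = /tmp/selected_metadata.properties
```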
12. Get Help
The help command lists the commands available in your current state. Before connecting:
Calabash > help
connect - connect to Calabash server to access metadata
help - show this message
quit/exit/ctrl-C/ctrl-D - exit
Calabash >
After connecting but before setting a data system context:
Calabash (jdoe) > help
list ds - list all data systems you can access
desc ds - see details of the data system
use ds - use a data system
connect - connect to calabash server to access metadata
help - show this message
quit/exit/ctrl-C/ctrl-D - exit
Calabash (jdoe) >
After having set a data system context:
Calabash (jdoe:lake1) > help
list i - list all infrastructure objects in the data system
list r - list all readers in the data system
list p - list all pipelines in the data system
list w - list all writers in the data system
desc i - see details of an infrastructure object
desc r - see details of a reader
desc p - see details of a pipeline
desc w - see details of a writer
deploy i - deploy an infrastructure object
deploy r - deploy a reader
deploy p - deploy a pipeline
deploy w - deploy a writer
undeploy i - undeploy an infrastructure object
undeploy r - undeploy a reader
undeploy p - undeploy a pipeline
undeploy w - undeploy a writer
set-up-ssl - create SSL key and certificate supported by a PCA
set-up-kce - set up a Kafka client environment on this host to interact with the Kafka system
at pca - get the access token of a PCA
at r - get the access token of a reader
reset-at - reset access token of a reader
code-gen - generate pipeline code
export-ds - export metadata according to the spec file
import-ds - import from the metadata file
list ds - list all data systems you can access
desc ds - see details of the data system
use ds - use a data system
connect - connect to Calabash server to access metadata
help - show this message
quit/exit/ctrl-C/ctrl-D - exit
Calabash (jdoe:lake1)>