The name of our writer is “bq-writer-1.” To create it, click the “Writer” heading at the top of the Calabash GUI window to open the Writer page. Set the data system to “lake_finance,” then click the big green “Create Writer” button.
Enter the following information in the writer creation form.
By now, you should be familiar with the schema editor. Use it to define the schemas for the source records in the topic “aggregated-log1.”
Select “Google Bigquery” as the target type. As soon as “Google Bigquery” is selected, the remaining properties change to include those pertinent to Bigquery.
You need to define the project, dataset, and table name for the Bigquery table. Optionally, you can also enable Bigquery streaming. In this tutorial, we leave that flag unset and use the transactional loader to save data to Bigquery. Streaming is faster, but it is a best-effort service: there is no guarantee that all data will reach the table.
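To make the trade-off concrete, here is a minimal sketch of the two write paths as they appear in the google-cloud-bigquery Python client. It is an assumption that Calabash's transactional loader behaves like a load job; the tutorial does not say how it is implemented. The function names and the duck-typed `client` parameter are illustrative only; in real use you would pass a `google.cloud.bigquery.Client`.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def write_streaming(client: Any, table_id: str, rows: List[Row]) -> List[Any]:
    """Best-effort path: push rows through the streaming insert API.

    Returns the per-row errors reported by the service. An empty list
    means every row in this call was accepted.
    """
    return list(client.insert_rows_json(table_id, rows))


def write_with_load_job(client: Any, table_id: str, rows: List[Row]) -> None:
    """All-or-nothing path: submit a load job and wait for it to finish.

    result() blocks until the job completes and raises if the job failed.
    (Assumption: this is analogous to the transactional loader.)
    """
    job = client.load_table_from_json(rows, table_id)
    job.result()
```

The streaming call returns quickly and reports failures row by row, while the load job is slower but either commits the whole batch or none of it.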
Next, define the target schema and mappings from the source schema to the target schema. Right now, the output schema is empty.
Click on the small blue plus sign to add a field to the output schema.
Here, we define the first field of the output record, “account_id,” and set its data type to “string.” There is also an “output expression” to define, which specifies the input field that the output field is mapped from. Since the key of the input record is the account id, we select “key” from the drop-down list as the expression for “account_id.”
Other possibilities in the drop-down list include “value” for the entire value record, “value.count” for the count field in the value, etc.
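The way these expressions select data can be sketched in a few lines of Python. This is only an illustration of the idea, not Calabash's implementation: the helper names `resolve` and `map_record` are hypothetical, and so are the sample key and value.

```python
from typing import Any, Dict


def resolve(expression: str, key: Any, value: Dict[str, Any]) -> Any:
    """Resolve an output expression such as "key" or "value.count"."""
    root, *path = expression.split(".")
    if root == "key":
        current = key
    elif root == "value":
        current = value
    else:
        raise ValueError(f"unknown expression root: {root!r}")
    # Walk into nested fields, e.g. "value.count" -> value["count"].
    for name in path:
        current = current[name]
    return current


def map_record(mappings: Dict[str, str], key: Any, value: Dict[str, Any]) -> Dict[str, Any]:
    """Build one output row from {output field: output expression}."""
    return {field: resolve(expr, key, value) for field, expr in mappings.items()}


# Hypothetical input record: the key is the account id, the value holds a count.
mappings = {"account_id": "key", "count": "value.count"}
row = map_record(mappings, "acct-42", {"count": 7})
```

With these mappings, the sample record becomes the output row `{"account_id": "acct-42", "count": 7}`.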
Click the “OK” button to return to the writer editor. Similarly, add all the other output fields so that the schema looks like the following screenshot.
Finally, define the remaining properties for the writer.
The writer is idempotent, which means you can launch multiple writer instances to share the load. They will never interfere with each other.
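The tutorial does not describe how this idempotence is achieved. As a rough sketch of the general idea, assume each record carries a stable identity, here a hypothetical (partition, offset) pair from the source topic, and that the sink ignores any identity it has already stored. The class below is an in-memory stand-in for the target table, not Calabash code.

```python
from typing import Any, Dict, Tuple


class IdempotentSink:
    """In-memory stand-in for a target table keyed by record identity."""

    def __init__(self) -> None:
        self.rows: Dict[Tuple[int, int], Dict[str, Any]] = {}

    def write(self, partition: int, offset: int, row: Dict[str, Any]) -> bool:
        """Store the row once; return False if it was already written."""
        record_id = (partition, offset)
        if record_id in self.rows:
            return False  # a replay or a second writer: nothing changes
        self.rows[record_id] = row
        return True


sink = IdempotentSink()
first = sink.write(0, 15, {"account_id": "acct-42", "count": 7})
replay = sink.write(0, 15, {"account_id": "acct-42", "count": 7})
```

Under this assumption, a second writer that processes the same record leaves the table unchanged, which is why several writers can run side by side.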
The error log is a cloud storage file that holds all records that could not be saved to the target, together with their error messages.
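The tutorial does not specify the layout of the error log file. Purely as an illustration, one plausible layout is one JSON object per line, pairing each failed record with its error message. The function name, the field names "record" and "error", and the sample failure below are all assumptions.

```python
import json
from typing import Any, Dict, Iterable, Tuple


def format_error_log(failures: Iterable[Tuple[Dict[str, Any], str]]) -> str:
    """Render failed records as JSON lines: one record plus its error per line."""
    lines = [
        json.dumps({"record": record, "error": message}, sort_keys=True)
        for record, message in failures
    ]
    return "\n".join(lines)


# Hypothetical failure: the count field does not match the target column type.
log_text = format_error_log(
    [({"account_id": "acct-42", "count": "seven"}, "count: expected integer")]
)
```

A line-oriented format like this keeps every failed record next to the reason it failed, so the records can be corrected and replayed later.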