API Docs - v1.0.2

Tested Siddhi Core version: 5.1.14

It could also support other Siddhi Core minor versions.

Sink

azuredatalake (Sink)

Azure Data Lake Sink can be used to publish (write) event data processed within Siddhi to a file in Azure Data Lake.
The siddhi-io-azuredatalake sink supports writing both textual and binary data into files.

Syntax

@sink(type="azuredatalake", account.name="<STRING>", account.key="<STRING>", blob.container="<STRING>", add.to.existing.blob.container="<BOOL>", recreate.blob.container="<BOOL>", file.path="<STRING>", append="<BOOL>", add.line.separator="<BOOL>", @map(...)))

QUERY PARAMETERS

account.name

Azure storage account name to be used. This can be found in the storage account's settings under 'Access Keys'.

Possible Data Types: STRING | Optional: No | Dynamic: No

account.key

Azure storage account key to be used. This can be found in the storage account's settings under 'Access Keys'.

Possible Data Types: STRING | Optional: No | Dynamic: No

blob.container

The new or existing container name to be used.

Possible Data Types: STRING | Optional: No | Dynamic: No

add.to.existing.blob.container

This flag specifies whether adding to an existing container with the same name is allowed.

Default Value: true | Possible Data Types: BOOL | Optional: Yes | Dynamic: No

recreate.blob.container

This flag specifies whether an existing container with the same name should be deleted and recreated.

Default Value: false | Possible Data Types: BOOL | Optional: Yes | Dynamic: No

file.path

The absolute file path to the file in the blob container.

Possible Data Types: STRING | Optional: No | Dynamic: No

append

This flag specifies whether the data should be appended to the file or not.
If append = 'true', data is written at the end of the file without changing the existing content.
If append = 'false' and the given file exists, the existing content is deleted before the new data is written.
It is advisable to set append = 'true' when dynamic URLs are used.
(See the configuration sketch following this parameter list.)

Default Value: true | Possible Data Types: BOOL | Optional: Yes | Dynamic: No

add.line.separator

This flag specifies whether events added to the file should be separated by a newline.
If add.line.separator = 'true', a newline is added after each event is written to the file.
If the @map type is 'csv', the newline is added by default.

Default Value: true | Possible Data Types: BOOL | Optional: Yes | Dynamic: No
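
For illustration, the following is a minimal sketch with the optional container and file handling flags set explicitly. The account name, key, container and path values are placeholders rather than values from a working deployment.

-- Placeholder credentials: copy the real values from the storage account's 'Access Keys' settings.
@sink(
type='azuredatalake',
account.name='<storage.account.name>',
account.key='<storage.account.key>',
blob.container='samplecontainer',
add.to.existing.blob.container='true',
recreate.blob.container='false',
append='true',
add.line.separator='true',
file.path='parentDir/subDir/events.txt',
@map(type='csv')
)
define stream BarStream (symbol string, price float, volume long);

With append = 'true' and add.line.separator = 'true' (both defaults), each event is appended to events.txt as a new line; setting recreate.blob.container = 'true' instead would delete and recreate an existing 'samplecontainer'.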

Examples

EXAMPLE 1

@sink(
type='azuredatalake', 
account.name='wso2datalakestore', 
account.key='jUTCeBGgQgd7Wahm/tLGdFgoHuxmUC+KYzqiBKgKMt26gGp1Muk2U6gy34A3oqogQ4EX3+9SGUlXKHQNALeYqQ==', 
blob.container='samplecontainer', 
file.path='parentDir/subDir/events.txt', 
@map(type='csv')
)
define stream BarStream (symbol string, price float, volume long);

Under the above configuration, a file (events.txt) will be created (if it does not exist) in the 'wso2datalakestore' storage account under the parentDir/subDir/ path.
Each event received by the sink is appended to the file in CSV format, so the output will look as follows:
WSO2,55.6,100
IBM,55.7,102
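
Building on Example 1, the sketch below shows one possible complete Siddhi app around this sink: a plain input stream, the sink-annotated stream, and a query forwarding events between them. The @App:name, the FooStream stream and the query are illustrative additions rather than part of the extension itself, and the account key is a placeholder.

@App:name('AzureDataLakeSinkSample')

define stream FooStream (symbol string, price float, volume long);

@sink(
type='azuredatalake',
account.name='wso2datalakestore',
account.key='<account.key>',
blob.container='samplecontainer',
file.path='parentDir/subDir/events.txt',
@map(type='csv')
)
define stream BarStream (symbol string, price float, volume long);

-- Every event sent to FooStream is forwarded to BarStream and written to events.txt as a CSV line.
from FooStream
select symbol, price, volume
insert into BarStream;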

Source

azuredatalake (Source)

Azure Data Lake Source can be used to read event data from files in Azure Data Lake and feed data to Siddhi.

Syntax

@source(type="azuredatalake", account.name="<STRING>", account.key="<STRING>", blob.container="<STRING>", dir.uri="<STRING>", file.name="<STRING>", bytes.to.read.from.file="<LONG>", tailing.enabled="<BOOL>", action.after.process="<STRING>", move.after.process="<STRING>", waiting.time.to.load.container="<LONG>", time.interval.to.check.directory="<LONG>", time.interval.to.check.for.updates="<LONG>", @map(...)))

QUERY PARAMETERS

account.name

Azure storage account name to be used. This can be found in the storage account's settings under 'Access Keys'.

Possible Data Types: STRING | Optional: No | Dynamic: No

account.key

Azure storage account key to be used. This can be found in the storage account's settings under 'Access Keys'.

Possible Data Types: STRING | Optional: No | Dynamic: No

blob.container

The container name of the Azure Data Lake Storage.

Possible Data Types: STRING | Optional: No | Dynamic: No
dir.uri

The absolute path to the directory in the blob container from which files should be read.

Possible Data Types: STRING | Optional: No | Dynamic: No
file.name

The absolute file path to the file in the blob container to be read.

Possible Data Types: STRING | Optional: No | Dynamic: No

bytes.to.read.from.file

The number of bytes to be read at once.

Default Value: 32768 | Possible Data Types: LONG | Optional: Yes | Dynamic: No

tailing.enabled

When this parameter is set to 'true', the extension will continuously check for new content added to the files.

Default Value: false | Possible Data Types: BOOL | Optional: Yes | Dynamic: No

action.after.process

The action to be carried out after processing a file in the given directory.
It can be either 'delete', 'move' or 'keep'; the default value is 'delete'.
If action.after.process is 'move', the location to move the consumed files to must be specified using the 'move.after.process' parameter (see the sketch following this parameter list).

Default Value: delete | Possible Data Types: STRING | Optional: Yes | Dynamic: No
move.after.process

If action.after.process is 'move', the location to move the consumed files to must be specified using this parameter.
This should be the path of the file that is going to be created after the move is done, given relative to the file system and excluding the file system name.

Possible Data Types: STRING | Optional: Yes | Dynamic: No
waiting.time.to.load.container

The extension will wait for the time specified by this parameter for the blob container to become available, before exiting with a SiddhiAppCreationException.

Default Value: 5000 | Possible Data Types: LONG | Optional: Yes | Dynamic: No

time.interval.to.check.directory

The extension will periodically check for new files in the given directory, at the given interval.

Default Value: 1000 | Possible Data Types: LONG | Optional: Yes | Dynamic: No

time.interval.to.check.for.updates

When 'tailing.enabled' is set to 'true', the extension will periodically check the file(s) for new updates, at the given interval.

Default Value: 1000 | Possible Data Types: LONG | Optional: Yes | Dynamic: No
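
As a sketch of the post-processing options above, the following configuration reads the file once (tailing disabled) and moves it after it has been consumed. The account details, directory and file paths are assumed placeholder values, not taken from the official examples.

-- Tailing is disabled so the file is read once; the consumed file is then moved to the move.after.process path.
@source(
type='azuredatalake',
account.name='<storage.account.name>',
account.key='<storage.account.key>',
blob.container='samplecontainer',
dir.uri='inputDir',
file.name='inputDir/events.txt',
tailing.enabled='false',
action.after.process='move',
move.after.process='processedDir/events.txt',
@map(type='csv')
)
define stream InputStream (symbol string, price float, volume long);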

Examples

EXAMPLE 1

@source(
type='azuredatalake', 
account.name='wso2datalakestore', 
account.key='jUTCeBGgQgd7Wahm/tLGdFgoHuxmUC+KYzqiBKgKMt26gGp1Muk2U6gy34A3oqogQ4EX3+9SGUlXKHQNALeYqQ==', 
blob.container='samplecontainer', 
dir.uri='parentDir/subDir', 
file.name='parentDir/subDir/events.txt', 
@map(type='csv')
)
define stream InputStream (symbol string, price float, volume long);

Under the above configuration, the file events.txt under the parentDir/subDir path of the 'samplecontainer' container in the 'wso2datalakestore' storage account is read by the source.
Each line of the file is converted from CSV into an event on InputStream, so content such as the following produces one event per line:
WSO2,55.6,100
IBM,55.7,102
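
For completeness, a possible end-to-end sketch pairing the source with a simple pass-through query is shown below; the @App:name, the OutputStream stream and the query are illustrative and not defined by the extension, and the account key is a placeholder.

@App:name('AzureDataLakeSourceSample')

@source(
type='azuredatalake',
account.name='wso2datalakestore',
account.key='<account.key>',
blob.container='samplecontainer',
dir.uri='parentDir/subDir',
file.name='parentDir/subDir/events.txt',
@map(type='csv')
)
define stream InputStream (symbol string, price float, volume long);

define stream OutputStream (symbol string, price float, volume long);

-- Each CSV line read from events.txt arrives as an event on InputStream and is forwarded to OutputStream.
from InputStream
select symbol, price, volume
insert into OutputStream;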