API Reference

intake_elasticsearch.elasticsearch_table.ElasticSearchTableSource(…)

Data source which executes arbitrary queries on Elasticsearch

intake_elasticsearch.elasticsearch_seq.ElasticSearchSeqSource(query)

Data source which executes arbitrary queries on Elasticsearch

class intake_elasticsearch.elasticsearch_table.ElasticSearchTableSource(*args, **kwargs)[source]

Data source which executes arbitrary queries on Elasticsearch

This is the tabular reader: it returns dataframes. Nested items in the returned documents become dict-like objects in the output.

Parameters
query: str

Query to execute. Can be either in Lucene single-line format or a JSON structured query (passed as text).

npartitions: int

Split the query into this many sections. If 1, the query is not split.

qargs: dict

Further parameters to pass to the query, such as the set of indexes to consider, filtering and ordering. See http://elasticsearch-py.readthedocs.io/en/master/api.html#elasticsearch.Elasticsearch.search

es_kwargs: dict

Settings for the ES connection, e.g., a simple local connection may be {'host': 'localhost', 'port': 9200}. Other keyword arguments passed to the plugin end up here; the material ones are:

scroll: str

How long the scroll context is kept alive; default '100m'.

size: int

The page size used when downloading results; default 1000.

metadata: dict

Extra information for this source.
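
A minimal sketch of how these parameters fit together. The index name 'my-index' and the connection settings are illustrative, and the table source is assumed to accept the same arguments as ElasticSearchSeqSource below:

    from intake_elasticsearch.elasticsearch_table import ElasticSearchTableSource

    # Hypothetical index and local cluster; adjust to your deployment
    source = ElasticSearchTableSource(
        '*:*',                        # Lucene single-line query matching all documents
        npartitions=2,                # split the scroll into two partitions
        qargs={'index': 'my-index'},  # extra arguments forwarded to search()
        host='localhost', port=9200,  # es_kwargs: connection settings
    )
    df = source.read()                # pandas DataFrame of all hits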

Attributes
cache_dirs
classname
datashape
description
has_been_persisted
hvplot

Returns a hvPlot object to provide a high-level plotting API.

is_persisted
plot

Returns a hvPlot object to provide a high-level plotting API.

plots

List custom associated quick-plots

Methods

close(self)

Close open resources corresponding to this data source.

discover(self)

Open resource and populate the source attributes.

export(self, path, **kwargs)

Save this data for sharing with other people

persist(self[, ttl])

Save data from this source to local persistent storage

read(self)

Read all data in one go

read_chunked(self)

Return iterator over container fragments of data source

read_partition(self, i)

Return a part of the data corresponding to the i-th partition.

to_dask(self)

Turn into dask.dataframe

to_spark(self)

Provide an equivalent data object in Apache Spark

yaml(self[, with_plugin])

Return YAML representation of this data-source

get_persisted

set_cache_dir
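
A sketch of consuming the partitions one at a time instead of calling read(), under the same illustrative settings as above:

    from intake_elasticsearch.elasticsearch_table import ElasticSearchTableSource

    source = ElasticSearchTableSource('*:*', npartitions=2,
                                      qargs={'index': 'my-index'},  # hypothetical
                                      host='localhost', port=9200)
    source.discover()                    # populate dtype, shape and npartitions
    for chunk in source.read_chunked():  # one pandas DataFrame per partition
        print(len(chunk))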

to_dask(self)[source]

Turn into dask.dataframe
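
A sketch of the lazy route; nothing is fetched until the dask graph is computed, and each query section becomes one dask partition:

    from intake_elasticsearch.elasticsearch_table import ElasticSearchTableSource

    source = ElasticSearchTableSource('*:*', npartitions=4,
                                      qargs={'index': 'my-index'},  # hypothetical
                                      host='localhost', port=9200)
    ddf = source.to_dask()    # lazy dask.dataframe, one partition per section
    df = ddf.compute()        # pull everything into a pandas DataFrame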

class intake_elasticsearch.elasticsearch_seq.ElasticSearchSeqSource(query, npartitions=1, qargs={}, metadata={}, **es_kwargs)[source]

Data source which executes arbitrary queries on Elasticsearch

This is the sequential reader: it returns a list of dictionaries.

Parameters
query: str

Query to execute. Can be either in Lucene single-line format or a JSON structured query (passed as text).

npartitions: int

Split the query into this many sections. If 1, the query is not split.

qargs: dict

Further parameters to pass to the query, such as the set of indexes to consider, filtering and ordering. See http://elasticsearch-py.readthedocs.io/en/master/api.html#elasticsearch.Elasticsearch.search

es_kwargs: dict

Settings for the ES connection, e.g., a simple local connection may be {'host': 'localhost', 'port': 9200}. Other keyword arguments passed to the plugin end up here; the material ones are:

scroll: str

How long the scroll context is kept alive; default '100m'.

size: int

The page size used when downloading results; default 1000.

metadata: dict

Extra information for this source.
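
Given the signature above, a minimal usage looks roughly like this; the index name is again a placeholder:

    from intake_elasticsearch.elasticsearch_seq import ElasticSearchSeqSource

    source = ElasticSearchSeqSource(
        '{"query": {"match_all": {}}}',  # JSON structured query, passed as text
        qargs={'index': 'my-index'},     # hypothetical index to search
        scroll='100m', size=1000,        # es_kwargs controlling the scan
        host='localhost', port=9200,     # es_kwargs: connection settings
    )
    docs = source.read()                 # list of dictionaries, one per hit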

Attributes
cache_dirs
classname
datashape
description
has_been_persisted
hvplot

Returns a hvPlot object to provide a high-level plotting API.

is_persisted
plot

Returns a hvPlot object to provide a high-level plotting API.

plots

List custom associated quick-plots

Methods

close(self)

Close open resources corresponding to this data source.

discover(self)

Open resource and populate the source attributes.

export(self, path, **kwargs)

Save this data for sharing with other people

persist(self[, ttl])

Save data from this source to local persistent storage

read(self)

Read all data in one go

read_chunked(self)

Return iterator over container fragments of data source

read_partition(self, i)

Return a part of the data corresponding to the i-th partition.

to_dask(self)

Form partitions into a dask.bag

to_spark(self)

Provide an equivalent data object in Apache Spark

yaml(self[, with_plugin])

Return YAML representation of this data-source

get_persisted

set_cache_dir

read(self)[source]

Read all data in one go
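
A sketch, reusing the illustrative connection settings from above:

    from intake_elasticsearch.elasticsearch_seq import ElasticSearchSeqSource

    source = ElasticSearchSeqSource('*:*', qargs={'index': 'my-index'},  # hypothetical
                                    host='localhost', port=9200)
    docs = source.read()   # all hits as one list of dictionaries
    first = docs[0]        # each element is a plain dict for one document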

to_dask(self)[source]

Form partitions into a dask.bag
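
Since the container is a sequence of dictionaries, to_dask() produces a dask.bag rather than a dataframe. A sketch, where the field name 'status' is hypothetical:

    from intake_elasticsearch.elasticsearch_seq import ElasticSearchSeqSource

    source = ElasticSearchSeqSource('*:*', npartitions=4,
                                    qargs={'index': 'my-index'},  # hypothetical
                                    host='localhost', port=9200)
    bag = source.to_dask()                                 # lazy dask.bag of dicts
    counts = bag.pluck('status').frequencies().compute()   # hypothetical field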