API Reference

intake_elasticsearch.elasticsearch_table.ElasticSearchTableSource(…)
    Data source which executes arbitrary queries on ElasticSearch (tabular reader)
intake_elasticsearch.elasticsearch_seq.ElasticSearchSeqSource(query)
    Data source which executes arbitrary queries on ElasticSearch (sequential reader)
class intake_elasticsearch.elasticsearch_table.ElasticSearchTableSource(*args, **kwargs)[source]

Data source which executes arbitrary queries on ElasticSearch

This is the tabular reader: it returns dataframes. Nested items in the returned documents become dict-like objects in the output.

Parameters:
query: str

Query to execute. Can be either a Lucene single-line query or a JSON structured query (presented as text)

npartitions: int

Split the query into this many sections. If one, the query is not split.

qargs: dict

Further parameters to pass to the query, such as the set of indexes to consider, filtering, or ordering. See http://elasticsearch-py.readthedocs.io/en/master/api.html#elasticsearch.Elasticsearch.search

es_kwargs: dict

Settings for the ES connection, e.g., a simple local connection may be {'host': 'localhost', 'port': 9200}. Other keyword arguments passed to the plugin end up here; the material ones are:

scroll: str

How long the query remains live; default '100m'.

size: int

The paging size when downloading; default 1000.

metadata: dict

Extra information for this source.
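
For illustration, a minimal sketch of constructing the tabular source directly and reading it into a pandas DataFrame. The host, port, index name ('myindex') and the query itself are placeholder assumptions, not part of this reference, and a live ES instance is assumed:

    import json

    from intake_elasticsearch.elasticsearch_table import ElasticSearchTableSource

    # A JSON structured query, presented as text; a Lucene single-line
    # string such as 'status:error' would be accepted equally.
    query = json.dumps({"query": {"match_all": {}}})

    source = ElasticSearchTableSource(
        query,
        npartitions=1,                # one section: the query is not split
        qargs={'index': 'myindex'},   # hypothetical index name
        host='localhost', port=9200,  # es_kwargs: the ES connection
        scroll='100m', size=1000,     # scroll lifetime and paging size
    )

    df = source.read()                # pandas DataFrame of all hits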

Attributes:
cache_dirs
datashape
description
hvplot

Returns a hvPlot object to provide a high-level plotting API.

plot

Returns a hvPlot object to provide a high-level plotting API.

plots

List of custom associated quick-plots

Methods

close() Close open resources corresponding to this data source.
discover() Open the resource and populate the source attributes.
read() Read all data in one go.
read_chunked() Return an iterator over container fragments of the data source.
read_partition(i) Return the part of the data corresponding to the i-th partition.
to_dask() Turn into a dask.dataframe.
to_spark() Provide an equivalent data object in Apache Spark.
yaml([with_plugin]) Return a YAML representation of this data source.
set_cache_dir()
to_dask()[source]

Turn into a dask.dataframe
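
A sketch of the lazy path, again with placeholder connection details and a hypothetical index; with npartitions greater than one, the query is split across the dataframe's partitions:

    from intake_elasticsearch.elasticsearch_table import ElasticSearchTableSource

    source = ElasticSearchTableSource(
        '{"query": {"match_all": {}}}',
        npartitions=4,                # split the query into four sections
        qargs={'index': 'myindex'},   # hypothetical index name
        host='localhost', port=9200,
    )

    ddf = source.to_dask()            # dask.dataframe, evaluated lazily
    df = ddf.compute()                # materialise as pandas when needed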

class intake_elasticsearch.elasticsearch_seq.ElasticSearchSeqSource(query, npartitions=1, qargs={}, metadata={}, **es_kwargs)[source]

Data source which executes arbitrary queries on ElasticSearch

This is the sequential reader: it returns a list of dictionaries.

Parameters:
query: str

Query to execute. Can be either a Lucene single-line query or a JSON structured query (presented as text)

npartitions: int

Split the query into this many sections. If one, the query is not split.

qargs: dict

Further parameters to pass to the query, such as the set of indexes to consider, filtering, or ordering. See http://elasticsearch-py.readthedocs.io/en/master/api.html#elasticsearch.Elasticsearch.search

es_kwargs: dict

Settings for the ES connection, e.g., a simple local connection may be {'host': 'localhost', 'port': 9200}. Other keyword arguments passed to the plugin end up here; the material ones are:

scroll: str

How long the query remains live; default '100m'.

size: int

The paging size when downloading; default 1000.

metadata: dict

Extra information for this source.
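
A sketch of the sequential reader under the same placeholder assumptions (local connection, hypothetical 'logs' index and field names); read() returns a list of dictionaries, one per document:

    from intake_elasticsearch.elasticsearch_seq import ElasticSearchSeqSource

    source = ElasticSearchSeqSource(
        'machine:node12 AND status:error',  # Lucene single-line query
        qargs={'index': 'logs'},            # hypothetical index name
        host='localhost', port=9200,
    )

    records = source.read()                 # list of dicts, one per hit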

Attributes:
cache_dirs
datashape
description
hvplot

Returns a hvPlot object to provide a high-level plotting API.

plot

Returns a hvPlot object to provide a high-level plotting API.

plots

List of custom associated quick-plots

Methods

close() Close open resources corresponding to this data source.
discover() Open the resource and populate the source attributes.
read() Read all data in one go.
read_chunked() Return an iterator over container fragments of the data source.
read_partition(i) Return the part of the data corresponding to the i-th partition.
to_dask() Form the partitions into a dask.bag.
to_spark() Provide an equivalent data object in Apache Spark.
yaml([with_plugin]) Return a YAML representation of this data source.
set_cache_dir()
read()[source]

Read all data in one go, returning a list of dictionaries.

to_dask()[source]

Form the partitions into a dask.bag
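
A sketch of the bag path, with the same hypothetical connection and index; each bag partition corresponds to one section of the query:

    from intake_elasticsearch.elasticsearch_seq import ElasticSearchSeqSource

    source = ElasticSearchSeqSource(
        '{"query": {"match_all": {}}}',
        npartitions=4,             # one bag partition per query section
        qargs={'index': 'logs'},   # hypothetical index name
        host='localhost', port=9200,
    )

    bag = source.to_dask()         # dask.bag of dictionaries
    n = bag.count().compute()      # e.g. count documents across partitions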