mars.dataframe.read_sql_table
- mars.dataframe.read_sql_table(table_name, con, schema=None, index_col=None, coerce_float=True, parse_dates=None, columns=None, chunksize=None, test_rows=5, chunk_size=None, engine_kwargs=None, incremental_index=True, use_arrow_dtype=None, partition_col=None, num_partitions=None, low_limit=None, high_limit=None)
Read SQL database table into a DataFrame.
Given a table name and a SQLAlchemy connectable, returns a DataFrame. This function does not support DBAPI connections.
- Parameters
table_name (str) – Name of SQL table in database.
con (SQLAlchemy connectable or str) – A database URI can be provided as a str. SQLite DBAPI connection mode is not supported.
schema (str, default None) – Name of SQL schema in database to query (if database flavor supports this). Uses default schema if None (default).
index_col (str or list of str, optional, default: None) – Column(s) to set as index (MultiIndex).
coerce_float (bool, default True) – Attempts to convert values of non-string, non-numeric objects (like decimal.Decimal) to floating point. Can result in loss of precision.
parse_dates (list or dict, default None) –
List of column names to parse as dates.
Dict of {column_name: format string}, where format string is strftime compatible in case of parsing string times, or is one of (D, s, ns, ms, us) in case of parsing integer timestamps.
Dict of {column_name: arg dict}, where the arg dict corresponds to the keyword arguments of pandas.to_datetime(). Especially useful with databases without native Datetime support, such as SQLite.
columns (list, default None) – List of column names to select from SQL table.
chunksize (int, default None) – If specified, returns an iterator where chunksize is the number of rows to include in each chunk. Note that this argument is kept only for compatibility. If a non-None value is passed, an error will be reported.
test_rows (int, default 5) – The number of rows to fetch for inferring dtypes.
chunk_size (int or tuple of ints, optional) – Specifies chunk size for each dimension.
engine_kwargs (dict, default None) – Extra kwargs to pass to sqlalchemy.create_engine
incremental_index (bool, default True) – If index_col is not specified, ensure that the range index is incremental. Setting this to False gives slightly better performance.
use_arrow_dtype (bool, default None) – If True, use arrow dtype to store columns.
partition_col (str, default None) – Name of the column used to split the result of the query. If specified, the range [low_limit, high_limit] will be divided into num_partitions ranges of equal length. The number of rows in each resulting chunk is not guaranteed to be equal. When the value is None, OFFSET and LIMIT clauses will be used to cut the result of the query.
num_partitions (int, default None) – The number of chunks to divide the result of the query into, when partition_col is specified.
low_limit (default None) – The lower bound of the range of column partition_col. If not specified, a query will be executed to find the minimum of the column.
high_limit (default None) – The upper bound of the range of column partition_col. If not specified, a query will be executed to find the maximum of the column.
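The range-based splitting described for partition_col can be sketched with the standard library alone. This is only an illustration of the idea, not Mars's actual implementation: the helper partition_bounds, the events table, and its integer id column are all hypothetical, and the sketch assumes an integer partition column with at least as many distinct values in the range as partitions.

```python
import sqlite3


def partition_bounds(low_limit, high_limit, num_partitions):
    """Split the closed integer range [low_limit, high_limit] into
    num_partitions contiguous sub-ranges of (nearly) equal length.

    Hypothetical helper for illustration; not part of Mars.
    """
    span = high_limit - low_limit + 1
    step, extra = divmod(span, num_partitions)
    bounds = []
    start = low_limit
    for i in range(num_partitions):
        # The first `extra` ranges absorb the remainder, one value each.
        width = step + (1 if i < extra else 0)
        end = start + width - 1
        bounds.append((start, end))
        start = end + 1
    return bounds


# Hypothetical table whose ids are clustered near the low end of the range.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, payload TEXT)")
ids = [1, 2, 3, 4, 5, 6, 50, 100]
conn.executemany(
    "INSERT INTO events VALUES (?, ?)", [(i, f"row-{i}") for i in ids]
)

# When no limits are given, they can be found with a MIN/MAX query.
low, high = conn.execute("SELECT MIN(id), MAX(id) FROM events").fetchone()

bounds = partition_bounds(low, high, 4)

# Each range becomes one chunk; count the rows that fall into it.
chunk_sizes = [
    conn.execute(
        "SELECT COUNT(*) FROM events WHERE id BETWEEN ? AND ?", b
    ).fetchone()[0]
    for b in bounds
]
```

The four ranges each cover 25 ids, yet the chunks hold 6, 1, 0 and 1 rows. Equal-length ranges only produce balanced chunks when the partition column is roughly uniformly distributed.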
- Returns
A SQL table is returned as a two-dimensional data structure with labeled axes.
- Return type
See also
read_sql_query – Read SQL query into a DataFrame.
read_sql – Read SQL query or database table into a DataFrame.
Notes
Any datetime values with time zone information will be converted to UTC.
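To make the note concrete, the following minimal sketch shows what conversion to UTC means for a single timezone-aware value. It uses only Python's datetime module and an arbitrarily chosen UTC+8 offset; it is not how Mars performs the conversion internally.

```python
from datetime import datetime, timedelta, timezone

# A timestamp stored with a UTC+8 offset (the offset is chosen arbitrarily).
local = datetime(2021, 3, 1, 9, 30, tzinfo=timezone(timedelta(hours=8)))

# Converting to UTC keeps the instant in time but changes the wall-clock fields.
as_utc = local.astimezone(timezone.utc)
```

Both values denote the same instant; only the displayed offset and wall-clock time differ after conversion.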
Examples
>>> import mars.dataframe as md
>>> md.read_sql_table('table_name', 'postgres:///db_name')
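The loss of precision mentioned for coerce_float can be seen without a database. The sketch below is a standalone illustration using Python's decimal module with an arbitrarily chosen value; it does not call Mars.

```python
from decimal import Decimal

# A value with more significant digits than a 64-bit float can hold.
exact = Decimal("0.12345678901234567890123456789")

# What coercion to floating point amounts to for a single value.
coerced = float(exact)

# Converting back exposes the digits that were lost along the way.
round_trip = Decimal(coerced)
lost_precision = round_trip != exact
```

If exact decimal values matter (for example, monetary amounts), passing coerce_float=False keeps such columns from being converted to floats.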