Skip to content

ENH: support index=True for SQL io.sql.get_schema #9084

New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Open
jorisvandenbossche opened this issue Dec 15, 2014 · 2 comments
Open

ENH: support index=True for SQL io.sql.get_schema #9084

jorisvandenbossche opened this issue Dec 15, 2014 · 2 comments
Labels
Enhancement IO SQL to_sql, read_sql, read_sql_query

Comments

@jorisvandenbossche
Copy link
Member

pd.io.sql.get_schema can be used to see the schema 'create table' statement. This function should support including the index (which is actually the default for to_sql)

In [31]: df = pd.DataFrame({'col':[1,2,3]}, index=pd.date_range('2012-01-01', periods=3))

In [33]: print pd.io.sql.get_schema(df, 'test_schema', con=engine_postgres)

CREATE TABLE test_schema (
    col BIGINT
)

In [34]: print pd.io.sql.get_schema(df.reset_index(), 'test_schema', con=engine_postgres)

CREATE TABLE test_schema (
    index TIMESTAMP WITHOUT TIME ZONE, 
    col BIGINT
)
@trbs
Copy link

trbs commented Apr 18, 2016

I can create a PR for this if wanted.

Would we want the patch (i assume so) to be backwards compatible with index=False as the default for get_schema or have the PR conform to the default index=True value as is used in other places else. (meanly to_sql)

vrajmohan pushed a commit to vrajmohan/pandas that referenced this issue Jan 30, 2019
Closes pandas-dev#9084

- Decided to keep the default as index=False to keep the API consistent. `to_sql` has index=True.
- Tempted to name the parameter `include_dataframe_index` as "index" has
a different meaning in a SQL context.
vrajmohan pushed a commit to vrajmohan/pandas that referenced this issue Jan 30, 2019
Closes pandas-dev#9084

- Decided to keep the default as `index=False` to keep the API consistent. `to_sql` has `index=True`.
- Tempted to name the parameter `include_dataframe_index` as "index" has
a different meaning in a SQL context.
vrajmohan pushed a commit to vrajmohan/pandas that referenced this issue Feb 1, 2019
Closes pandas-dev#9084

- Decided to keep the default as `index=False` to keep the API consistent. `to_sql` has `index=True`.
- Tempted to name the parameter `include_dataframe_index` as "index" has
a different meaning in a SQL context.
@fangchenli
Copy link
Member

After adding the index option, the returned default "index" name has double quotes. Is this expected?

df = pd.DataFrame({
            "one": [1, 2, 3], "two": [1, 2, 3]
        }, index=list("abc"))

schema_with_index = sql.get_schema(df, 'test', index=True, con=self.conn)

CREATE TABLE test (
	"index" TEXT, 
	one BIGINT, 
	two BIGINT
)

df.index.name = 'new_index'
schema_with_index_rename = sql.get_schema(df, 'test', index=True, con=self.conn)

CREATE TABLE test (
	new_index TEXT, 
	one BIGINT, 
	two BIGINT
)

@jreback jreback modified the milestones: Contributions Welcome, 1.1 Jul 2, 2020
@jreback jreback modified the milestones: 1.1, Contributions Welcome Jul 10, 2020
@mroeschke mroeschke removed this from the Contributions Welcome milestone Oct 13, 2022
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
Enhancement IO SQL to_sql, read_sql, read_sql_query
Projects
None yet
6 participants