Labels

Android (1) bash (2) boost (2) C (34) C++ (2) cheatsheet (2) CLion (6) css (3) Debian (33) DL (17) Docker (2) Dreamweaver (2) Eclipse (3) fail2ban (4) git (5) GitHub (4) Hacking (3) html (8) http (1) iOS (1) iPad (1) IRC (1) Java (31) javascript (3) Linux (169) Mac (19) Machine Learning (1) mySQL (49) Netbeans (4) Networking (1) Nexus (1) OpenVMS (6) Oracle (1) Pandas (3) php (16) Postgresql (8) Python (9) raid (1) RedHat (14) Samba (2) Slackware (48) SQL (14) svn (1) tar (1) ThinkPad (1) Virtualbox (3) Visual Basic (1) Visual Studio (1) Windows (2)

Thursday, 27 December 2018

SQL in Jupyter Notebooks specifically PostgreSQL



import pandas as pd

# Run the database cleanup script (included for completeness)
!./sql_initial_state.py
#use ipython-sql magic to connect to the database
%load_ext sql
%load_ext schemadisplay_magic
#Connect to database
%sql postgresql://tm351_student:tm351_pass@localhost:5432/tm351_clean
·          %sql : states that we are making the connection with the ipython-sql extension
·          postgresql : tells ipython-sql that we will use PostgreSQL as our database engine
·          tm351_student : id of the user who is logging in
·          tm351_pwd : the user's password
·          localhost:5432 : the port on which the database engine is listening
·          tm351_clean : the name of the database

OR BELOW METHOD USING DB_CONNECTION VARIABLE

#setup variables in python
DB_ENGINE='postgresql' # tells ipython-sql that we will use postgreSQL
DB_USER='tm351_student'  # id of the user who is logging in
DB_PWD='tm351_pwd' # the user's password
DB_ADDR='localhost:5432' # the host and port
DB_NAME='tm351_clean' # the name of the database 

#Create connection String
DB_CONNECTION = '{engine}://{user}:{pwd}@{addr}/{name}'.format(engine=DB_ENGINE,
                                                               user=DB_USER,
                                                               pwd=DB_PWD,
                                                               addr=DB_ADDR,
                                                               name=DB_NAME)

OR

DB_CONNECTION = DB_ENGINE+'://'+DB_USER+':'+DB_PWD+'@'+DB_ADDR+'/'+DB_NAME 

#connect to database 
%sql $DB_CONNECTION


#Run a query
%%sql

SELECT *
FROM someTable
LIMIT 10; 

#Put query result into dataframe   %%sql variable name <<
#Example
 
%%sql my_df <<

SELECT *
FROM someTables
LIMIT 10; 


#To run SQL queries in python
from pandasql import sqldf
pysqldf = lambda q: sqldf(q, globals())

query = '''
SELECT *
FROM myTable;
'''

result = pysqldf(query) 

#CREATE TABLE

%%sql

CREATE TABLE patient (
   
    patient_id CHAR(4),
    patient_name VARCHAR(20),
    date_of_birth DATE,
    gender CHAR(6),
    height_cm DECIMAL(4,1),
    weight_kg DECIMAL(4,1),
    doctor_id CHAR(4),
   
    PRIMARY KEY (patient_id)
 ); 


#Pass connection string to schemadisplay_magic
%schema --connection_string $DB_CONNECTION
%schema --connection_string $DB_CONNECTION -t someTable

NOTE: when running query, use double quotes for column names but NOT for values, for values use only a single quote, example:

%%sql

SELECT *
        FROM ema
    WHERE "Location" = 'On or near Aylesford Drive'

    LIMIT 100;

No comments:

Post a Comment

Note: only a member of this blog may post a comment.