I’m trying to write a program in Python that uses an SQL query to collect data and build a regression model. When I actually try to create the model, however, it gives me this error. I’m pretty sure I know what is going wrong, but I have no idea how to fix it. I’ve tried several things already,
Tag: pandas
Adding many-to-many relations is producing an empty queryset
I have a model called Occurrence with a number of foreign-key and many-to-many relationships (I’ve only shown a few fields here). I am having issues when attempting to create any many-to-many relation to an Occurrence instance after I have loaded bulk data into my Postgres (PostGIS) database using the SQLAlchemy engine. Before this upload of data I can make
PostgreSQL equivalent of Pandas outer merge
I am trying to do in Postgres the equivalent of a Pandas outer merge, in order to outer-merge two tables. Table df_1 contains these data: Table df_2 contains these data: So Table df_1 has one more column (random_id) than df_2. Also, job_id 1711418 and worker_id 45430 exist in both df_1 and df_2. If I use the “outer merge” method in
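A minimal pandas sketch of that outer merge, with invented rows since the excerpt omits the actual tables; the matching Postgres construct is a FULL OUTER JOIN, shown in the comment:

```python
import pandas as pd

# Hypothetical frames mirroring the question: df_1 has an extra
# random_id column, and one (job_id, worker_id) pair exists in both.
df_1 = pd.DataFrame({"job_id": [1711418, 1], "worker_id": [45430, 2],
                     "random_id": [10, 11]})
df_2 = pd.DataFrame({"job_id": [1711418, 3], "worker_id": [45430, 4]})

# how="outer" keeps unmatched rows from both sides.
merged = df_1.merge(df_2, on=["job_id", "worker_id"], how="outer")

# The Postgres equivalent of how="outer" is a FULL OUTER JOIN:
#   SELECT * FROM df_1 FULL OUTER JOIN df_2 USING (job_id, worker_id);
print(merged)
```

Rows present only in one table get NULL (NaN on the pandas side) for the other table’s columns, which is exactly what the outer merge produces.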
Processing multiple similar rows in Pandas
I have a dataframe pulled from a relational database. A one-to-many join has resulted in many similar rows with one column different. I would like to combine the similar rows but have the differing column data contained within a list, for each unique row. I am also able to change the SQL but I think this may be easier to
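One pandas approach that fits this description is a groupby on the columns that repeat, with `agg(list)` on the column that differs; the column names below are invented, since the real schema is not shown:

```python
import pandas as pd

# Hypothetical one-to-many join result: each order row is repeated
# once per item, with only the "item" column differing.
df = pd.DataFrame({
    "order_id": [1, 1, 2],
    "customer": ["ann", "ann", "bob"],
    "item": ["pen", "ink", "pad"],
})

# Collapse the repeats: group on the identical columns and collect
# the differing column into a list per unique row.
out = (df.groupby(["order_id", "customer"], as_index=False)
         .agg(items=("item", list)))
print(out)
```

The SQL-side alternative mentioned in the question would be an aggregate such as `array_agg(item)` with a `GROUP BY` on the repeated columns.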
How do I write this in Python, preferably in pandas? (Assume that I am dealing with a dataframe)
This is the code that I am trying to convert to Pandas. Assume the following columns for the input dataframe: geo | region | sub region | txn_date | revenue | profit. Columns in the output dataframe: geo | region | ytd_rev | py_ytd_rev | total_profit. Answer I believe you need GroupBy.agg with named aggregation and new columns created in DataFrame.assign:
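A hedged sketch of the suggested pattern, with made-up data and an assumed YTD definition (revenue summed over the current and prior year of txn_date), since the original code is not shown in the excerpt:

```python
import pandas as pd

# Invented sample data matching the stated input columns.
df = pd.DataFrame({
    "geo": ["NA", "NA", "EU"],
    "region": ["east", "east", "west"],
    "txn_date": pd.to_datetime(["2021-01-05", "2020-02-01", "2021-03-01"]),
    "revenue": [100.0, 50.0, 70.0],
    "profit": [10.0, 5.0, 7.0],
})

cur_year = df["txn_date"].dt.year.max()

# assign() creates helper columns, then named aggregation in agg()
# produces the output columns directly.
out = (df.assign(
            ytd=df["revenue"].where(df["txn_date"].dt.year == cur_year, 0),
            py_ytd=df["revenue"].where(df["txn_date"].dt.year == cur_year - 1, 0))
         .groupby(["geo", "region"], as_index=False)
         .agg(ytd_rev=("ytd", "sum"),
              py_ytd_rev=("py_ytd", "sum"),
              total_profit=("profit", "sum")))
print(out)
```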
Is there a way to run PostgreSQL queries on a pandas dataframe?
I have a pandas dataframe like this:

                    created_at        lat       long           hex_ID
    0   2020-10-13 15:12:18.682905  28.690628  77.323285  883da1ab0bfffff
    1   2020-10-12 22:49:05.886170  28.755408  77.112289  883da18e87fffff
    2   2020-10-13 15:24:17.692375  28.690571  77.323335  883da1ab0bfffff
    3   2020-10-12 23:21:13.700226  28.589922  77.082738  883da112a1fffff
    4   2020-10-13 15:43:58.887592  28.649227  77.339063  883da1a941fffff

and I want to convert it like this:

                 created_at           hex_id  count
    0   2020-10-28 22:00:00  883da11185fffff      4
    1   2020-09-09 10:00:00
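The target output looks like a count per hex cell per hour; a pandas sketch with invented rows (if actual PostgreSQL-style SQL is wanted, libraries such as DuckDB can run SQL directly against a pandas DataFrame):

```python
import pandas as pd

# Invented rows shaped like the question's dataframe.
df = pd.DataFrame({
    "created_at": pd.to_datetime([
        "2020-10-13 15:12:18", "2020-10-13 15:24:17", "2020-10-12 22:49:05"]),
    "hex_ID": ["883da1ab0bfffff", "883da1ab0bfffff", "883da18e87fffff"],
})

# Floor each timestamp to the hour, then count rows per (hour, hex).
out = (df.groupby([df["created_at"].dt.floor("h"), "hex_ID"])
         .size()
         .reset_index(name="count"))
print(out)
```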
Better solution to index a DataFrame according to the values of 2 others
I would like to index a DataFrame (aaxx_df) according to the values of two others (val1_df for the columns and val2_df for the rows). I put a solution that works for my problem below, but I guess there must be much cleaner solutions, possibly via SQL (it seems very similar to a relational-database problem).
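One clean pattern for this kind of two-frame lookup is label-to-position translation with `get_indexer` followed by NumPy fancy indexing; a sketch with invented labels and shapes, since the real frames are not shown:

```python
import pandas as pd

# Invented frames: val1_df names a column of aaxx_df per lookup,
# val2_df names a row per lookup.
aaxx_df = pd.DataFrame([[1, 2], [3, 4]],
                       index=["r1", "r2"], columns=["c1", "c2"])
val1_df = pd.DataFrame({"col": ["c1", "c2"]})
val2_df = pd.DataFrame({"row": ["r2", "r1"]})

# Translate labels to integer positions, then pick element-wise.
rows = aaxx_df.index.get_indexer(val2_df["row"])
cols = aaxx_df.columns.get_indexer(val1_df["col"])
picked = aaxx_df.to_numpy()[rows, cols]
print(picked)
```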
Declare a variable with pd.read_sql_query
Can someone explain why I get an error when executing the following simple query with pandas:

    import pyodbc
    import pandas as pd

    connstr = 'Driver={SQL Server}; Server=sr1; Database=db'
    conn = …
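The usual way to feed a variable into pd.read_sql_query is the `params` argument rather than pasting it into the SQL string; sketched here with sqlite3 instead of the question’s pyodbc/SQL Server setup so it runs anywhere:

```python
import sqlite3
import pandas as pd

# In-memory stand-in for the question's SQL Server database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [(1, "a"), (2, "b")])

wanted = 2  # the "variable" to declare for the query
# The placeholder style (? here) depends on the DB driver;
# pyodbc also uses ?.
df = pd.read_sql_query("SELECT * FROM t WHERE id = ?", conn,
                       params=(wanted,))
print(df)
```

Besides being simpler than string formatting, parameterized queries avoid SQL-injection and quoting problems.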
Add datetime column with values based on another datetime column
I have a table: Using SQL (BigQuery dialect) I need to add one column, date_today_max, such that it copies all data from the date column, but for records with the latest date (meaning max(date)) it replaces date with current_date; with Python+Pandas I’d achieve similar with … but I have no clue how to tackle this with SQL. There is a
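On the pandas side this can be done with Series.mask; a sketch of the BigQuery equivalent (a windowed MAX plus IF) is in the comment. The column names and sample dates are assumptions:

```python
import pandas as pd

# Invented table with the max date appearing on two rows.
df = pd.DataFrame({"date": pd.to_datetime(
    ["2021-01-01", "2021-02-01", "2021-02-01"])})

today = pd.Timestamp.today().normalize()
# Copy date, but replace the rows holding the maximum date with today.
df["date_today_max"] = df["date"].mask(df["date"] == df["date"].max(), today)

# A BigQuery sketch of the same idea:
#   SELECT *,
#          IF(date = MAX(date) OVER (), CURRENT_DATE(), date)
#            AS date_today_max
#   FROM t;
print(df)
```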
Dropping the index column from a DataFrame in a .csv file in Pandas
I have a Python script here: and when I run it, it creates separate CSV files that are formatted as SQL. The output looks like this in generatedfile2: The rest of the files have this same format. Is there any way I can change my code to get rid of the “2” at the beginning of the code? It won’t
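If the stray leading number is the DataFrame index (which to_csv writes by default), passing `index=False` drops it; a minimal sketch with an invented row, written to a buffer instead of the question’s generated files:

```python
import io
import pandas as pd

# One invented SQL-formatted row, like the generated files described.
df = pd.DataFrame({"col": ["INSERT INTO t VALUES (1)"]})

buf = io.StringIO()
# index=False stops to_csv from prefixing each row with the index.
df.to_csv(buf, index=False)
print(buf.getvalue())
```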