This works, but the outputs are not matched on the index (Date). Instead, the new columns are added starting at the first dataframe's last row, i.e. the data is stacked "on top" of each other so the Date index is repeated. Is there a way to iterate and create columns that are matched by Date? Output: Thanks! Answer: Just
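A minimal pandas sketch of the usual fix (frame and column names are invented for illustration): concatenating with `axis=1` aligns rows on the shared Date index instead of stacking the frames vertically.

```python
import pandas as pd

# Two hypothetical frames indexed by Date whose columns should be
# matched on the index rather than appended below each other.
df_a = pd.DataFrame({"price": [10, 11]},
                    index=pd.to_datetime(["2021-07-29", "2021-07-30"]))
df_b = pd.DataFrame({"volume": [100, 120]},
                    index=pd.to_datetime(["2021-07-30", "2021-07-29"]))

# axis=1 concatenation aligns rows on the shared Date index, so each
# new column lines up by Date even though the row order differs.
combined = pd.concat([df_a, df_b], axis=1)
```

In a loop, you would collect the per-iteration frames in a list and call `pd.concat(frames, axis=1)` once at the end.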
Tag: dataframe
time data ‘(datetime.date(2021, 7, 30), )’ does not match format ‘%Y/%m/%d’
I am accessing a date from a database using the query below, in my JupyterLab notebook. It is giving this ValueError: time data '(datetime.date(2021, 7, 30), )' does not match format '%Y/%m/%d'. Can anyone point me to the correct way, please? Answer: It seems c_date is already a datetime.date object, so you don't need cDate = str(c_date). Try:
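A minimal sketch of the point, assuming the driver returns `c_date` as a `datetime.date`: format the existing object with `strftime` rather than parsing it from a string with `strptime`.

```python
import datetime

# c_date as fetched from the database driver: already a
# datetime.date, so there is nothing to parse.
c_date = datetime.date(2021, 7, 30)

# strftime formats the date object directly into the wanted layout.
formatted = c_date.strftime("%Y/%m/%d")
```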
Select some datetime from base
I have 2 tables in my database. The first is the reservation table; start_ts and end_ts are the times when the reservation of a desk (desk_id) starts and ends: [Reservation_table] The second is motion, which comes from the motion sensors: [Motion_table] The two tables are connected in that the sensors start saving to the database when someone comes to a desk (desk_id).
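A hedged pandas sketch of how the two tables could be related (all column names and rows here are assumptions, not taken from the actual schema): join on desk_id, then keep only motion events that fall inside a reservation window.

```python
import pandas as pd

# Hypothetical reservation table: one row per desk booking.
reservations = pd.DataFrame({
    "desk_id": [1, 2],
    "start_ts": pd.to_datetime(["2021-07-30 09:00", "2021-07-30 10:00"]),
    "end_ts":   pd.to_datetime(["2021-07-30 12:00", "2021-07-30 11:00"]),
})
# Hypothetical motion table: one row per sensor event.
motion = pd.DataFrame({
    "desk_id": [1, 1, 2],
    "motion_ts": pd.to_datetime(
        ["2021-07-30 09:30", "2021-07-30 13:00", "2021-07-30 10:30"]),
})

# Join on desk_id, then keep motion events inside the reservation
# window of that desk (the SQL analogue is a join with a BETWEEN
# condition on the timestamps).
joined = motion.merge(reservations, on="desk_id")
during = joined[(joined["motion_ts"] >= joined["start_ts"])
                & (joined["motion_ts"] <= joined["end_ts"])]
```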
How to execute custom logic at pyspark window partition
I have a dataframe in the format shown below, where there will be multiple entries per DEPNAME. My requirement is to set result = Y at the DEPNAME level if either flag_1 or flag_2 = Y; if both flags, i.e. flag_1 and flag_2, = N, the result is set to N, as shown for DEPNAME=personnel.
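In PySpark this is typically `F.max` over a `Window.partitionBy("DEPNAME")`; here is the same partition-level logic sketched in pandas (column names from the question, data invented):

```python
import pandas as pd

# Toy data following the question's columns; values are invented.
df = pd.DataFrame({
    "DEPNAME": ["sales", "sales", "personnel", "personnel"],
    "flag_1":  ["Y", "N", "N", "N"],
    "flag_2":  ["N", "N", "N", "N"],
})

# A department gets result = "Y" if any of its rows has flag_1 or
# flag_2 equal to "Y", otherwise "N".  groupby(...).transform("max")
# plays the role of max over a Window partition in PySpark.
any_y = (df["flag_1"] == "Y") | (df["flag_2"] == "Y")
df["result"] = (any_y.groupby(df["DEPNAME"])
                     .transform("max")
                     .map({True: "Y", False: "N"}))
```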
Processing multiple similar rows in Pandas
I have a dataframe pulled from a relational database. A one-to-many join has resulted in many similar rows with one column differing. I would like to combine the similar rows but have the differing column's data contained within a list for each unique row. I am also able to change the SQL, but I think this may be easier to
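A minimal sketch of the usual pandas approach, with an invented schema: group on the columns that define a unique row and collect the differing column into a list.

```python
import pandas as pd

# Hypothetical result of the one-to-many join: every column repeats
# except "tag", which differs per joined row.
df = pd.DataFrame({
    "id":   [1, 1, 2],
    "name": ["alpha", "alpha", "beta"],
    "tag":  ["red", "blue", "green"],
})

# Group on the identifying columns and aggregate the varying
# column into a Python list per group.
collapsed = (df.groupby(["id", "name"], as_index=False)
               .agg({"tag": list}))
```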
SparkSQLContext dataframe Select query based on column array
This is my dataframe: I want to select all books where the author is Udo Haiber, but of course it didn't work because authors is an array. Answer: You can use array_contains to check whether the author is inside the array. Use single quotes to quote the author name, because you're using double quotes for the query string.
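In Spark SQL the filter is `array_contains(authors, 'Udo Haiber')`; the same membership test sketched in pandas (table contents invented):

```python
import pandas as pd

# Toy books table; "authors" holds an array per row, as in the question.
books = pd.DataFrame({
    "title":   ["Book A", "Book B"],
    "authors": [["Udo Haiber", "Jane Doe"], ["John Smith"]],
})

# Pandas analogue of Spark SQL's
#   SELECT * FROM books WHERE array_contains(authors, 'Udo Haiber')
mask = books["authors"].apply(lambda a: "Udo Haiber" in a)
matches = books[mask]
```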
How can I select a column where another column has a specific value
I have a pyspark data frame. How can I select a column where another column has a specific value? Suppose I have n columns; for 2 of them, A and B, I have:
A  B
a  b
a  c
d  f
I want all of column B. …
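In PySpark this would be `df.filter(df.A == "a").select("B")`; a sketch of the same selection in pandas, using the two-column example above:

```python
import pandas as pd

# The small two-column example from the question.
df = pd.DataFrame({"A": ["a", "a", "d"], "B": ["b", "c", "f"]})

# Select column B for the rows where column A has the wanted value.
b_for_a = df.loc[df["A"] == "a", "B"]
```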
How do I write this in Python, preferably in pandas? (Assume that I am dealing with a dataframe)
This is the code that I am trying to convert to pandas. Assume the following columns for the input data frame: geo | region | sub region | txn_date | revenue | profit. Columns in the output dataframe: geo | region | ytd_rev | py_ytd_rev | total_profit. Answer: I believe you need GroupBy.agg with named aggregation, plus new columns created in DataFrame.assign:
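A hedged sketch of that answer's shape, with invented data and an assumed meaning of "ytd" (current-year rows) and "py_ytd" (prior-year rows): mask the revenue per period with `assign`, then aggregate with named aggregation.

```python
import pandas as pd

# Invented rows matching the stated input columns.
df = pd.DataFrame({
    "geo":      ["EMEA", "EMEA", "EMEA"],
    "region":   ["West", "West", "West"],
    "txn_date": pd.to_datetime(["2021-03-01", "2021-05-01", "2020-04-01"]),
    "revenue":  [100.0, 50.0, 80.0],
    "profit":   [10.0, 5.0, 8.0],
})

# assign() builds per-period revenue columns (NaN outside the period),
# then named aggregation sums them per geo/region group.
out = (df.assign(
            ytd_rev=df["revenue"].where(df["txn_date"].dt.year == 2021),
            py_ytd_rev=df["revenue"].where(df["txn_date"].dt.year == 2020))
         .groupby(["geo", "region"], as_index=False)
         .agg(ytd_rev=("ytd_rev", "sum"),
              py_ytd_rev=("py_ytd_rev", "sum"),
              total_profit=("profit", "sum")))
```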
Better solution to index a DataFrame according to the values of 2 others
I would like to index a DataFrame (aaxx_df) according to the values of 2 others (val1_df for the columns and val2_df for the rows). I put below a solution that works for my problem, but I suspect there must be a much cleaner solution, possibly via SQL (the problem seems very similar to a relational-database one).
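One possible reading of the problem, sketched with invented data: for each position, pick the element of aaxx_df at the row label given by val2_df and the column label given by val1_df, using vectorised integer indexing instead of nested loops.

```python
import pandas as pd

# Hypothetical setup: aaxx_df holds the values, the two lookup
# series hold row and column labels per element to fetch.
aaxx_df = pd.DataFrame([[1, 2], [3, 4]],
                       index=["r0", "r1"], columns=["c0", "c1"])
val1_df = pd.Series(["c1", "c0"])   # column label per lookup
val2_df = pd.Series(["r0", "r1"])   # row label per lookup

# Translate labels to integer positions, then do one vectorised
# fancy-indexing pass over the underlying array.
rows = aaxx_df.index.get_indexer(val2_df)
cols = aaxx_df.columns.get_indexer(val1_df)
picked = pd.Series(aaxx_df.to_numpy()[rows, cols])
```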
How to join two Hive tables with embedded array of struct and array in PySpark
I am trying to join two Hive tables on Databricks. tab1: The schema of "some_questions". "some_questions" example: tab2: I need to join tab1 and tab2 by "question_id" such that I get a new table. I tried to join them with PySpark, but I am not sure how to decompose the array with the embedded struct/array. Thanks. Answer: For Spark SQL, you can
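In Spark SQL the usual tool is `explode` (or `LATERAL VIEW explode`) followed by a join on the struct field; the shape of that operation sketched in pandas, with invented schemas for both tables:

```python
import pandas as pd

# tab1: each row carries an array of question structs (schema invented).
tab1 = pd.DataFrame({
    "user": ["u1"],
    "some_questions": [[{"question_id": 1, "text": "q one"},
                        {"question_id": 2, "text": "q two"}]],
})
# tab2: per-question metadata to join in (schema invented).
tab2 = pd.DataFrame({"question_id": [1, 2], "topic": ["a", "b"]})

# Explode the array so each struct gets its own row, flatten the
# struct fields into columns, then join on question_id.
exploded = tab1.explode("some_questions").reset_index(drop=True)
flat = pd.concat(
    [exploded.drop(columns="some_questions"),
     pd.json_normalize(exploded["some_questions"].tolist())],
    axis=1)
joined = flat.merge(tab2, on="question_id")
```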