Tag: data-manipulation

Creating a VIEW to remove duplicates from table based on max date

data-manipulation database snowflake-cloud-data-platform sql

I have a table that appends data each day and records the imported date; however, it also appends duplicates. My end goal is to remove the duplicates based on the lowest date in the Imported column. Here is the initial state of that table:

TABLE CLIENTS

Name   Surname  Imported
Bob    John     18-07-2022
Marta  White    18-07-2022
Ryan   Max      18-07-2022
Bob    John     20-07-2022
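
One possible reading of the question above is "keep one row per Name/Surname pair, with the earliest Imported date". The sketch below illustrates that reading only. It uses Python's built-in sqlite3 module as a stand-in for Snowflake, stores the dates as ISO text (YYYY-MM-DD) so that MIN() orders them correctly, and the view name clients_dedup is an assumption made for the example.

```python
import sqlite3

# Sample rows from the question, with dates rewritten as ISO text (assumption)
# so that string comparison matches chronological order.
ROWS = [
    ("Bob", "John", "2022-07-18"),
    ("Marta", "White", "2022-07-18"),
    ("Ryan", "Max", "2022-07-18"),
    ("Bob", "John", "2022-07-20"),
]


def build_dedup_view(rows):
    """Load rows into an in-memory table and create a de-duplicating view."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE clients (name TEXT, surname TEXT, imported TEXT)")
    conn.executemany("INSERT INTO clients VALUES (?, ?, ?)", rows)
    # clients_dedup is a hypothetical view name: one row per person,
    # carrying the earliest imported date.
    conn.execute(
        """
        CREATE VIEW clients_dedup AS
        SELECT name, surname, MIN(imported) AS imported
        FROM clients
        GROUP BY name, surname
        """
    )
    return conn


conn = build_dedup_view(ROWS)
deduped = conn.execute(
    "SELECT name, surname, imported FROM clients_dedup ORDER BY name, surname"
).fetchall()
print(deduped)
```

Swapping MIN for MAX would keep the most recent import instead, which is what the title of the question suggests.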

Find total IDs between two dates that satisfies a condition

data-manipulation dataframe python r sql

I have a dataset PosNeg like this. I need to count the IDs that show a pattern such as P N P P or N P N N P N, that is, at least one N (negative) between two P’s (positive). If this pattern occurs at least once, the ID is counted. Date is always in ascending order.
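
The pattern described above (a P, then at least one N, then another P) can be checked with a small state machine per ID. The sketch below is a minimal illustration, assuming each record is a tuple of (id, date, result) with ISO date strings; the record layout and the function names are hypothetical, since the dataset itself is not shown in the excerpt.

```python
from collections import defaultdict


def has_p_n_p(results):
    """Return True if some P is followed by at least one N and then another P."""
    seen_p = False
    seen_n_after_p = False
    for result in results:
        if result == "P":
            if seen_n_after_p:
                return True
            seen_p = True
        elif result == "N" and seen_p:
            seen_n_after_p = True
    return False


def count_matching_ids(records):
    """Count IDs whose date-ordered results contain the P ... N ... P pattern."""
    by_id = defaultdict(list)
    # Sort by id, then date, so each ID's results are in ascending date order.
    for id_, _date, result in sorted(records, key=lambda rec: (rec[0], rec[1])):
        by_id[id_].append(result)
    return sum(1 for seq in by_id.values() if has_p_n_p(seq))


# Illustrative data only (not from the question).
records = [
    (1, "2022-01-01", "P"), (1, "2022-01-02", "N"),
    (1, "2022-01-03", "P"), (1, "2022-01-04", "P"),
    (2, "2022-01-01", "P"), (2, "2022-01-02", "P"), (2, "2022-01-03", "N"),
    (3, "2022-01-01", "N"), (3, "2022-01-02", "P"), (3, "2022-01-03", "N"),
    (3, "2022-01-04", "N"), (3, "2022-01-05", "P"), (3, "2022-01-06", "N"),
]
print(count_matching_ids(records))
```

IDs 1 and 3 follow the two example patterns from the question; ID 2 never has a P after its N, so it is not counted.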

R: “Fuzzy Match” and “Between” Statements

data-manipulation fuzzy-logic join r sql

I am working with the R Programming Language. I have the following tables (note: all variables appear as “Factors”): I am trying to “join” (e.g. inner join) these tables on the following conditions: 1) table_1$id is “fuzzy equal” to table_2$id, AND 2) table_1$date is BETWEEN table_2$date_2 and table_2$date_3. I tried to write the following code in R to do this: Question: But I am

Take last value in sequence

azure-sql-database data-manipulation database sql sql-server

I am trying to insert data from my source table into my target table, where the target table has an additional column called SaleTo. SaleTo = the SaleFrom based on the MAX SaleSequence.

Example of the source table:

SaleNo  SaleFrom   SaleSequence
1       Alabama    2
1       Minnesota  1
1       Virginia   3

Example of target table:

SaleNo  SaleFrom  SaleSequence  SaleTo
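
A minimal sketch of one way to fill SaleTo, assuming it should be the SaleFrom value of the row with the highest SaleSequence within the same SaleNo. It runs against SQLite through Python's sqlite3 module rather than SQL Server, so it uses ORDER BY ... LIMIT 1 where T-SQL would use TOP 1; the table names source and target are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical table names; columns follow the excerpt.
    CREATE TABLE source (SaleNo INTEGER, SaleFrom TEXT, SaleSequence INTEGER);
    CREATE TABLE target (SaleNo INTEGER, SaleFrom TEXT, SaleSequence INTEGER, SaleTo TEXT);

    INSERT INTO source VALUES
        (1, 'Alabama', 2),
        (1, 'Minnesota', 1),
        (1, 'Virginia', 3);

    -- For every source row, look up the SaleFrom of the row with the
    -- highest SaleSequence in the same SaleNo and store it as SaleTo.
    INSERT INTO target (SaleNo, SaleFrom, SaleSequence, SaleTo)
    SELECT s.SaleNo,
           s.SaleFrom,
           s.SaleSequence,
           (SELECT m.SaleFrom
              FROM source AS m
             WHERE m.SaleNo = s.SaleNo
             ORDER BY m.SaleSequence DESC
             LIMIT 1)
      FROM source AS s;
    """
)

target_rows = conn.execute(
    "SELECT SaleNo, SaleFrom, SaleSequence, SaleTo FROM target ORDER BY SaleSequence"
).fetchall()
print(target_rows)
```

With the three sample rows, every target row gets SaleTo = 'Virginia', because Virginia has SaleSequence 3.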

How can I change vartypes of a SQL table from R using ODBC package?

data-manipulation odbc r sql

This works; however, I want to change the sepal.width to decimal(28,0). Is it possible to do it before writing to SQL, or can I modify the SQL table to change the column type from R? I know I can use RODBC, but I am forced to use R Version 3.6, so it is not an option. Answer: Look for field.types

SQL/Presto: how to rank within a subgroup of each group

data-manipulation presto sql trino

I have a table like the following: I want to rank each user_id within each subgroup of each group, first by score and then by time (earlier is better). So the desired output is: Answer: Use rank(): Actually, I’m not sure if higher scores are better than lower ones, so you might want score asc.
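
To make the ranking rule above concrete, here is a small Python sketch that mimics what rank() over a (group, subgroup) partition would produce, ordered by score descending and then time ascending. The row layout (group, subgroup, user_id, score, time) and the sample values are assumptions, since the original table is not shown in the excerpt.

```python
from itertools import groupby


def rank_within_subgroup(rows):
    """Append a rank to each (group, subgroup, user_id, score, time) row.

    Ranking restarts for every (group, subgroup) pair. Higher score ranks
    first, earlier time breaks ties, and exact ties share a rank with a gap
    afterwards, as SQL rank() does.
    """
    ordered = sorted(rows, key=lambda r: (r[0], r[1], -r[3], r[4]))
    ranked = []
    for _, partition in groupby(ordered, key=lambda r: (r[0], r[1])):
        prev_key, rank = None, 0
        for position, row in enumerate(partition, start=1):
            key = (row[3], row[4])
            if key != prev_key:
                rank = position
                prev_key = key
            ranked.append(row + (rank,))
    return ranked


# Illustrative data only.
rows = [
    ("A", "x", 1, 90, 5),
    ("A", "x", 2, 90, 3),
    ("A", "x", 3, 80, 1),
    ("A", "y", 4, 70, 2),
    ("A", "x", 5, 90, 3),
]
ranks_by_user = {row[2]: row[5] for row in rank_within_subgroup(rows)}
print(ranks_by_user)
```

If lower scores are better, as the answer hints may be the case, drop the minus sign in the sort key.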

Manipulate data in SQL (backfilling, pivoting)

amazon-redshift data-manipulation sql

I have a table similar to this small example: I want to manipulate it to this format: Here’s a sample SQL script to create an example input table: CREATE TABLE sample_table ( id INT, hr INT, …

Split Full Name with Format: {Last, First Middle} Comprehensive Cases

data-manipulation data-quality sql sql-server string

My client sent me name data as a Name string which includes the last, first, and middle names in a single entry. I need them split into LastName, FirstName, and MiddleName. I have found some scripts online, but they don’t serve my purposes because they either (1) use a different format, or (2) don’t handle edge cases very well. See
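
As a starting point for the {Last, First Middle} format described above, the sketch below splits on the first comma and treats the first token after it as the first name and any remaining tokens as the middle name. It is written in Python rather than T-SQL, the function name split_name is hypothetical, and it does not attempt the edge cases the question alludes to (suffixes, multiple commas and so on); a value with no comma is assumed to be a last name only.

```python
def split_name(full_name):
    """Split 'Last, First Middle' into (last, first, middle)."""
    last, comma, rest = full_name.partition(",")
    if not comma:
        # Assumption: with no comma present, treat the whole value as a last name.
        return (full_name.strip(), "", "")
    parts = rest.split()
    first = parts[0] if parts else ""
    middle = " ".join(parts[1:])
    return (last.strip(), first, middle)


print(split_name("Smith, John Allen"))
```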