
Tag: scala

Create rows from columns in an Apache Spark Dataset

I’m trying to create rows from existing columns of a Dataset. Here is my case:

InputDataset:

accountid           payingaccountid     billedaccountid     startdate                   enddate
0011t00000MY1U3AAL  0011t00000MY1U3XXX  0011t00000ZZ1U3AAL  2020-06-10 00:00:00.000000  NULL

And I would like to have something like this:

accountid           startdate                   enddate
0011t00000MY1U3AAL  2021-06-10 00:00:00.000000  NULL
0011t00000MY1U3XXX  2021-06-10 00:00:00.000000  NULL
0011t00000ZZ1U3AAL  2021-06-10 00:00:00.000000  NULL

In the input dataset the columns billedaccountid and
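A minimal sketch of one way to do this with Spark’s DataFrame API, assuming the column names above (the session setup and sample row are illustrative): gather the three id columns into an array and explode it, so each id becomes its own row while startdate and enddate are carried along.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

val spark = SparkSession.builder().appName("rows-from-columns").getOrCreate()
import spark.implicits._

// Illustrative input matching the question's schema.
val input = Seq(
  ("0011t00000MY1U3AAL", "0011t00000MY1U3XXX", "0011t00000ZZ1U3AAL",
   "2020-06-10 00:00:00.000000", null: String)
).toDF("accountid", "payingaccountid", "billedaccountid", "startdate", "enddate")

// Collect the three id columns into an array and explode it:
// one output row per id, with startdate/enddate carried along.
val result = input
  .withColumn("accountid",
    explode(array($"accountid", $"payingaccountid", $"billedaccountid")))
  .select("accountid", "startdate", "enddate")

result.show(false)
```

explode emits one row per array element, which is exactly the one-column-per-id to one-row-per-id reshaping the question asks for.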

Single quotes cause trouble while filtering in Slick

I have statements such as below, and they fail with exceptions such as this. I have tried to escape the single quote but wasn’t successful. When I tried to insert a record such as this, the exception I’ve gotten is: Please note that I am using H2 in MySQL mode to run my tests. Answer: That error suggests to me
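A sketch of the usual remedy, assuming a minimal USERS table for illustration: let Slick bind the value as a parameter, either through a lifted filter or the sql interpolator, so the JDBC driver handles the quote instead of the string being spliced into raw SQL.

```scala
import slick.jdbc.H2Profile.api._

// Hypothetical table definition, just for illustration.
class Users(tag: Tag) extends Table[(Int, String)](tag, "USERS") {
  def id   = column[Int]("ID", O.PrimaryKey)
  def name = column[String]("NAME")
  def *    = (id, name)
}
val users = TableQuery[Users]

// A value containing a single quote, e.g. "O'Brien".
val needle = "O'Brien"

// Lifted (type-safe) query: Slick binds `needle` as a bind parameter,
// so the quote is escaped by the driver, not by hand.
val safeFilter = users.filter(_.name === needle).result

// With plain SQL, the sql interpolator likewise binds parameters
// rather than concatenating the string into the statement.
val safePlain =
  sql"select ID, NAME from USERS where NAME = $needle".as[(Int, String)]
```

The failures typically come from building the SQL string by concatenation; once the value travels as a bind parameter, no manual escaping is needed.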

Finding largest number of location IDs per hour from each zone

I am using Scala with Spark and having a hard time understanding how to calculate the maximum count of pickups from a location corresponding to each hour. Currently I have a df with three columns (Location, hour, Zone), where Location is an integer, hour is an integer 0-23 signifying the hour of the day, and Zone is a string. Something like this
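One possible reading, sketched with invented sample data: if “maximum count per hour from each zone” means the busiest Location within each (Zone, hour) pair, you can count pickups per (Zone, hour, Location) and keep the top-ranked row with a window function.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._
import org.apache.spark.sql.expressions.Window

val spark = SparkSession.builder().appName("max-pickups").getOrCreate()
import spark.implicits._

// Invented sample matching the described schema (Location, hour, Zone).
val df = Seq(
  (101, 9, "A"), (101, 9, "A"), (202, 9, "A"),
  (303, 9, "B"), (303, 10, "B"), (404, 10, "B")
).toDF("Location", "hour", "Zone")

// Count pickups per (Zone, hour, Location).
val counts = df.groupBy($"Zone", $"hour", $"Location")
  .agg(count("*").as("pickups"))

// For each (Zone, hour), keep the Location with the largest count.
val w = Window.partitionBy($"Zone", $"hour").orderBy($"pickups".desc)

val topPerZoneHour = counts
  .withColumn("rank", row_number().over(w))
  .filter($"rank" === 1)
  .drop("rank")

topPerZoneHour.show(false)
```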

SQL Database using JDBC + parameterize SQL Query + Databricks

In Databricks I am reading a SQL table as follows. How can I parameterize SourceSystem and RuleCode in the WHERE clause? I was referring to: https://docs.microsoft.com/en-us/azure/databricks/data/data-sources/sql-databases Answer: If you import the Spark implicits, you can create references to columns with the dollar ($) interpolator. You can also use the Column API to build the logic; it will be something like this. As you can
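A rough sketch of what the answer describes, with hypothetical connection details (jdbcUrl, table name, credentials) standing in for the real ones: read the table over JDBC, then express the WHERE clause through Column references built with the $ interpolator, binding the parameters as ordinary Scala values.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().getOrCreate()
import spark.implicits._

// Hypothetical connection details and parameter values.
val jdbcUrl      = "jdbc:sqlserver://myserver.database.windows.net:1433;database=mydb"
val sourceSystem = "SAP"
val ruleCode     = "R100"

val table = spark.read
  .format("jdbc")
  .option("url", jdbcUrl)
  .option("dbtable", "dbo.Rules")
  .option("user", "myuser")
  .option("password", "mypassword")
  .load()

// With spark.implicits in scope, $"col" builds a Column reference,
// so the WHERE clause is parameterized through the DataFrame API.
val filtered = table
  .where($"SourceSystem" === sourceSystem && $"RuleCode" === ruleCode)

filtered.show(false)
```

Spark pushes these simple predicates down to the database where possible, so the filter also limits what is read over JDBC.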
