I have an Excel spreadsheet table like this where it shows how many units were sold each week for 2019 and 2020 (columns) and I am trying to put it into an access database formatted like the second table:
| model | 2019 - Wk 1 U's | 2019 - Wk 2 U's | 2019 - Wk 3 U's | ... | 2020 - Wk 52 U's | --------------------------------------------------------------------------------------- abc 50 45 65 20 def 72 93 57 68
and I want to get it like this
| model | year | month | week | units | --------------------------------------- abc 2019 1 1 50 abc 2019 1 2 45 abc 2019 1 3 65 def 2019 1 5 72 def 2019 1 6 93 def 2019 1 7 57 ... .... .. .. .. abc 2020 12 52 20 def 2020 12 52 68
What’s the best way to do this in Python (pandas) or VBA/Excel formula (or SQL if need be)?
Advertisement
Answer
Just need to be systematic
- row index
set_index("model")
- name column series
df.columns.set_names("weekencoded", inplace=True)
- then you are ready to
stack()
- rename column to units,
reset_index()
so encoded string is available as columns - parse out the values you want from encoded string
- I’ve left it for you to convert a week number to a month number
data = """| model | 2019 - Wk 1 U's | 2019 - Wk 2 U's | 2019 - Wk 3 U's | ... | 2020 - Wk 52 U's | --------------------------------------------------------------------------------------- abc 50 45 65 20 def 72 93 57 68""" df = pd.DataFrame([[c for c in l.split(" ") if c!=""] for l in data.split("n")[2:]], columns=[c.strip() for c in data.split("n")[0].split("|") if c!="" and "..." not in c]) df.set_index("model", inplace=True) df.columns.set_names("weekencoded", inplace=True) df = df.stack().to_frame().rename(columns={0:"units"}).reset_index().assign( year=lambda dfa: dfa.apply(lambda s: re.match("20[1,2][0-9]", s["weekencoded"])[0], axis=1), week=lambda dfa: dfa.apply(lambda s: re.findall(r"(^.*Wk )([0-9]+)( U.*$)", s["weekencoded"])[0][1], axis=1), )
output
model weekencoded units year week abc 2019 - Wk 1 U's 50 2019 1 abc 2019 - Wk 2 U's 45 2019 2 abc 2019 - Wk 3 U's 65 2019 3 abc 2020 - Wk 52 U's 20 2020 52 def 2019 - Wk 1 U's 72 2019 1 def 2019 - Wk 2 U's 93 2019 2 def 2019 - Wk 3 U's 57 2019 3 def 2020 - Wk 52 U's 68 2020 52