Connecting to Fabric Lakehouses & Warehouses from Python code - Sam Debruyn
Let Fabric teach you how to code with Data Wrangler - Sam Debruyn
01.11.2023
#P13#
#P14# #P15# #P16# #P17# #P18#
# Split text using string '–' in column: 'YEAR'
df_split = df_clean['YEAR'].str.split(pat='–', expand=True).add_prefix('YEAR_')
df_clean = pd.concat([df_clean.iloc[:, :2], df_split, df_clean.iloc[:, 2:]], axis=1)
df_clean = df_clean.drop(columns=['YEAR'])
# Rename column 'YEAR_0' to 'year'
df_clean = df_clean.rename(columns={'YEAR_0': 'year'})
# Rename column 'MOVIES' to 'title'
df_clean = df_clean.rename(columns={'MOVIES': 'title'})
# Rename column 'YEAR_1' to 'year_end'
df_clean = df_clean.rename(columns={'YEAR_1': 'year_end'})
# Split text using string ',' in column: 'GENRE'
df_split = df_clean['GENRE'].str.split(pat=',', expand=True).add_prefix('GENRE_')
df_clean = pd.concat([df_clean.iloc[:, :4], df_split, df_clean.iloc[:, 4:]], axis=1)
df_clean = df_clean.drop(columns=['GENRE'])
# Rename column 'GENRE_0' to 'genre_0'
df_clean = df_clean.rename(columns={'GENRE_0': 'genre_0'})
# Rename column 'GENRE_1' to 'genre_1'
df_clean = df_clean.rename(columns={'GENRE_1': 'genre_1'})
# Rename column 'GENRE_2' to 'genre_2'
df_clean = df_clean.rename(columns={'GENRE_2': 'genre_2'})
# Remove leading and trailing whitespace in columns: 'genre_0', 'genre_1', 'genre_2'
df_clean['genre_0'] = df_clean['genre_0'].str.strip()
df_clean['genre_1'] = df_clean['genre_1'].str.strip()
df_clean['genre_2'] = df_clean['genre_2'].str.strip()
# Rename column 'RATING' to 'rating'
df_clean = df_clean.rename(columns={'RATING': 'rating'})
# Rename column 'VOTES' to 'vote_count'
df_clean = df_clean.rename(columns={'VOTES': 'vote_count'})
# Sort by column: 'rating' (descending)
df_clean = df_clean.sort_values(['rating'], ascending=[False])
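The generated steps above can be tried out on a small hand-made DataFrame. The sketch below wraps them in a function and merges the individual renames into one call. The function name `clean_data`, the sample rows, and the column order (`MOVIES`, `YEAR`, `GENRE`, `RATING`, `VOTES`) are assumptions made for illustration; they are not taken from the original dataset.

```python
import pandas as pd


def clean_data(df: pd.DataFrame) -> pd.DataFrame:
    """Apply the split, rename, strip and sort steps to a copy of df.

    Assumes 'YEAR' is the second column and that, once it has been split,
    'GENRE' is the fourth column, as the positional slices below require.
    """
    df_clean = df.copy()

    # Split 'YEAR' on the en dash and put the parts where 'YEAR' used to be
    df_split = df_clean['YEAR'].str.split(pat='–', expand=True).add_prefix('YEAR_')
    df_clean = pd.concat([df_clean.iloc[:, :2], df_split, df_clean.iloc[:, 2:]], axis=1)
    df_clean = df_clean.drop(columns=['YEAR'])

    # Split 'GENRE' on commas in the same way
    df_split = df_clean['GENRE'].str.split(pat=',', expand=True).add_prefix('GENRE_')
    df_clean = pd.concat([df_clean.iloc[:, :4], df_split, df_clean.iloc[:, 4:]], axis=1)
    df_clean = df_clean.drop(columns=['GENRE'])

    # Do all renames in one call instead of one call per column
    df_clean = df_clean.rename(columns={
        'MOVIES': 'title',
        'YEAR_0': 'year',
        'YEAR_1': 'year_end',
        'GENRE_0': 'genre_0',
        'GENRE_1': 'genre_1',
        'GENRE_2': 'genre_2',
        'RATING': 'rating',
        'VOTES': 'vote_count',
    })

    # Remove the whitespace left over after splitting on ','
    for col in ['genre_0', 'genre_1', 'genre_2']:
        df_clean[col] = df_clean[col].str.strip()

    # Highest-rated titles first
    return df_clean.sort_values(['rating'], ascending=[False])


# Made-up sample rows, for illustration only
sample = pd.DataFrame({
    'MOVIES': ['Show A', 'Show B'],
    'YEAR': ['2010–2015', '2018–2020'],
    'GENRE': ['Action, Crime, Drama', 'Comedy, Drama, Romance'],
    'RATING': [7.1, 8.4],
    'VOTES': [1200, 3400],
})

result = clean_data(sample)
print(result)
```

The positional `iloc` slices make the code depend on column order, so a dataset with a different layout would need those indices adjusted.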
#P19#