Converting a short PostgreSQL query into Spark Scala
$30-250 AUD
Completed
Posted over 3 years ago
Paid on delivery
Hi, I'm having quite a bit of trouble converting this PostgreSQL job into a Spark Scala job. I've made attempts but don't think they're quite correct. There are two scripts, and I'd really appreciate help with this ASAP. I'm fairly sure the Scala side uses the to_json function, but I'm not sure how to fully structure it; it's the array_to_json, array_agg, and json_build_object calls that I don't know how to convert. Happy to give more details as needed, including the full schemas of the origin DataFrames.
SCRIPT 1:
-- final_df_1
INSERT INTO final_df_1
SELECT
d.id_A,
d.id_B,
json_build_object('things',[login to view URL]) AS data
FROM (
SELECT
id_A,
id_B,
COALESCE(array_to_json(array_agg(json_build_object(
'id_A',id_A,
'id_B',id_B,
'description', description,
'latitude', latitude,
'longitude', longitude,
'latitude_2', latitude_2,
'code_number', code_number :: json
))), '[]') AS things
FROM df_1
GROUP BY
id_A,
id_B
) d;
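One possible translation of Script 1, sketched as a helper function (the function name `buildFinalDf1` and the exact column list are assumptions; adjust them to the real schema of df_1). The idea is that `collect_list` stands in for `array_agg`, a `struct` per row stands in for `json_build_object`, and `to_json` over an outer struct reproduces the final `json_build_object('things', ...)` wrapper:

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions._

// Hypothetical helper mirroring Script 1. Assumes df_1 has exactly the
// columns named in the SQL; rename to match the real schema.
def buildFinalDf1(df1: DataFrame): DataFrame = {
  df1
    .groupBy("id_A", "id_B")
    // collect_list plays the role of array_agg; each struct plays the
    // role of json_build_object. groupBy never produces empty groups,
    // so the COALESCE(..., '[]') branch can never fire here anyway.
    .agg(collect_list(struct(
      col("id_A"), col("id_B"), col("description"),
      col("latitude"), col("longitude"), col("latitude_2"),
      col("code_number")
    )).as("things"))
    // to_json over a struct containing the array reproduces
    // json_build_object('things', array_to_json(array_agg(...))).
    .select(col("id_A"), col("id_B"),
      to_json(struct(col("things"))).as("data"))
}
```

One caveat: `code_number :: json` in Postgres parses a string into JSON, whereas `to_json` here will serialize `code_number` as a plain string. If the nested JSON structure matters, parse it first with `from_json` and an explicit schema.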
SCRIPT 2:
-- final_df_2
INSERT INTO final_df_2
SELECT
d.phone_number,
d.location_id,
row_to_json(d) AS data
FROM (
SELECT
f.phone_number,
f.location_id,
f.item_ids::json AS item_ids,
(
SELECT
COALESCE(array_to_json(array_agg(row_to_json(fh))), '[]')
FROM (
SELECT
id_number,
creation_date,
subcase::json AS subcase
FROM df_origin
WHERE phone_number = f.phone_number
ORDER BY creation_date DESC
) fh
) fault_history
FROM df_add f
) d;
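A possible translation of Script 2, again as a hypothetical helper (`buildFinalDf2` and the column lists are assumptions based on the SQL; adjust to the real schemas of df_add and df_origin). The correlated subquery becomes a `groupBy` on df_origin followed by a left join back onto df_add; putting `creation_date` first in the struct lets `sort_array(..., asc = false)` emulate `ORDER BY creation_date DESC`:

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions._

// Hypothetical helper mirroring Script 2. Assumes df_add has
// phone_number, location_id, item_ids and df_origin has phone_number,
// id_number, creation_date, subcase; rename to match the real schemas.
def buildFinalDf2(dfAdd: DataFrame, dfOrigin: DataFrame): DataFrame = {
  // The correlated subquery per phone_number becomes one aggregation.
  // creation_date is the first struct field, so sorting the array
  // descending orders the history by creation_date DESC.
  val faultHistory = dfOrigin
    .groupBy("phone_number")
    .agg(sort_array(
      collect_list(struct(col("creation_date"), col("id_number"), col("subcase"))),
      asc = false
    ).as("fault_history"))

  dfAdd
    .join(faultHistory, Seq("phone_number"), "left")
    .select(
      col("phone_number"),
      col("location_id"),
      // row_to_json(d) becomes to_json over a struct of the same columns.
      // Caveat: to_json drops null fields, so a phone with no history
      // loses the fault_history key instead of getting '[]' as in Postgres.
      to_json(struct(
        col("phone_number"), col("location_id"),
        col("item_ids"), col("fault_history")
      )).as("data"))
}
```

As with Script 1, `item_ids::json` and `subcase::json` would need `from_json` with explicit schemas if the nested JSON must survive as structure rather than as escaped strings.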
Would really appreciate help soon, and I'm happy to pay $50-70 AUD for this.
Effort so far:
FOR DF1
val itemViewDf = readDf("df_1")
val final_df_1 = [login to view URL]("id_A", "id_B",
to_json(Seq(struct(/*all columns in df_1 */))).as("data"))
.groupBy("id_A", "id_B")