PySpark: get related records from array column values (child values)
$15-25 USD / hour
I have a Spark dataframe with an ID column and, among other columns, an array column holding the IDs of each record's related records.
An example dataframe:
ID | NAME | RELATED_IDLIST
--------------------------
123 | mike | [345,456]
345 | alen | [789]
456 | sam | [789,999]
789 | marc | [111]
555 | dan | [333]
893 | chad | null
From the above, I need to append all related child IDs (including the children of children, at any depth) to the array column of the parent ID. The resulting dataframe should look like this:
ID | NAME | RELATED_IDLIST
--------------------------
123 | mike | [345,456,789,999,111]
345 | alen | [789,111]
456 | sam | [789,999,111]
789 | marc | [111]
555 | dan | [333]
893 | chad | null
I am trying to implement this in PySpark and need help figuring out the above requirement.
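One way to read the requirement is as a transitive closure over the parent-to-child links. The following is a minimal sketch under stated assumptions, not a definitive implementation: the names `expand_related` and `expand_dataframe` are hypothetical, the ID and RELATED_IDLIST columns are assumed small enough to collect to the driver, and the result column is assumed to be acceptable as an array of longs. For data too large to collect, an iterative self-join or a graph library would likely be needed instead.

```python
from collections import deque


def expand_related(adjacency):
    """Breadth-first expansion of each ID's related list.

    adjacency maps ID -> list of child IDs, or None when there are none.
    A null list stays null; the `seen` set guards against cycles.
    """
    result = {}
    for node, children in adjacency.items():
        if children is None:
            result[node] = None
            continue
        seen = set()
        ordered = []
        queue = deque(children)
        while queue:
            current = queue.popleft()
            if current in seen:
                continue
            seen.add(current)
            ordered.append(current)
            # IDs that have no row of their own (e.g. 999) have no children.
            queue.extend(adjacency.get(current) or [])
        result[node] = ordered
    return result


def expand_dataframe(df):
    """Hypothetical PySpark wiring; assumes the ID graph fits on the driver."""
    from pyspark.sql import functions as F, types as T

    rows = df.select("ID", "RELATED_IDLIST").collect()
    closure = expand_related({r["ID"]: r["RELATED_IDLIST"] for r in rows})
    lookup = F.udf(lambda i: closure.get(i), T.ArrayType(T.LongType()))
    return df.withColumn("RELATED_IDLIST", lookup(F.col("ID")))


# The example data from the posting, as a plain dictionary.
adjacency = {
    123: [345, 456],
    345: [789],
    456: [789, 999],
    789: [111],
    555: [333],
    893: None,
}
expanded = expand_related(adjacency)
print(expanded[123])  # [345, 456, 789, 999, 111]
```

The breadth-first order happens to reproduce the ordering in the expected result above (direct children first, then grandchildren); if order does not matter, a set-based traversal would do equally well.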
Project ID: #28253757
About the project
Awarded to:
Hi, I have 4 years of experience as a Big data engineer. I worked in PySpark for 3 years. I think I will be a good match. I can complete work in 1 to 2 hours. Please message me. Regards, Vamsi
7 freelancers are bidding an average of $20/hour on this job
I'm a senior programmer in Python. I read the job details extremely carefully. I think I can help you to complete your project with my rich knowledge and experience. I'm ready to do it now. Should you require furthe...
Hi, how are you? I'm a data scientist specializing in R / Python / Spark / Hadoop / TensorFlow / SQL / Power BI. I need an example of the dataset and an example of the result you expect. I would like to know more in det...
Hi, I'm a Spark developer, good at Python and PySpark. Please let me know if you are interested and we can discuss.
I can very well help you with this project. Here's a small intro about me: I am a professional Data Engineer with very good real-time experience working with most of the latest big data technologies like Scala, ...