Create a Python script to analyze posting data stored in JSON format

I need help analyzing data from around 90 online forums as part of a research project. I want to identify factors that predict the success of online forums.

Access to an Amazon AWS instance will be supplied, as the files are over 30 GB. The work will need to be done on that instance.

Ideally we will be able to use the Python code in an IPython notebook. I am flexible on this, however.

The data comes nicely packaged in a set of JSON files:

[1] [url removed, login to view]

Contains badge data associated with user accounts - i.e. the user ID it was assigned to, the name of the badge, and the date/time it was assigned.

[2] [url removed, login to view]

Contains comments data - e.g. timestamp, user ID of the user writing the comment, score (upvotes/downvotes).

[3] [url removed, login to view]

Contains posting history data - e.g. posting revision type, details of the changes made, user ID associated with the user who made the change

[4] [url removed, login to view]

Contains posting data - e.g. the actual questions and answer text posted by users, tags associated with the questions/answers

[5] [url removed, login to view]

Contains user data - e.g. user ID, user account creation date, last access date, reputation score, etc.

[6] [url removed, login to view]

Contains voting data - e.g. vote type ID, timestamp of the voting action, user ID of the person making the vote

A data dictionary with more details is attached, as is a sample copy of the JSON files (containing only 3 forums' worth of data to make it easy to download; the real version has around 100 forums).
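
Because the full files are over 30 GB, the records would probably need to be read one at a time rather than loaded whole. The sketch below shows one way to do that. It assumes each file holds one JSON object per line (JSON Lines); the posting does not say how the files are laid out, so that layout, the function name iter_records, and the sample field names are all hypothetical and should be checked against the data dictionary.

```python
import io
import json
from typing import Iterator, TextIO


def iter_records(handle: TextIO) -> Iterator[dict]:
    """Yield one record at a time from a JSON Lines stream.

    Assumption: one JSON object per line. Only the current line is
    held in memory, so the file size does not matter.
    """
    for line_number, line in enumerate(handle, start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        try:
            yield json.loads(line)
        except json.JSONDecodeError as exc:
            raise ValueError(f"bad JSON on line {line_number}") from exc


# Small in-memory stand-in for one of the files (made-up records).
sample = io.StringIO(
    '{"forum": "cooking", "user_id": 1, "post_type": "question"}\n'
    "\n"
    '{"forum": "cooking", "user_id": 2, "post_type": "answer"}\n'
)
records = list(iter_records(sample))
print(len(records), "records read")
```

On the AWS instance the same function could be given a real file handle, for example one returned by open(path, encoding="utf-8"). If a file turns out to be a single large JSON array instead, a different streaming approach would be needed.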

In brief -

[1] we will need to classify each of the forums into two groups: success and failure. I will tell you which ones failed and which ones succeeded.

[2] We then focus on the first 14 days of data for each forum.

[3] We then calculate metrics for each one - e.g. average number of posts per day, average number of answers per day, concentration of people doing all the work

[4] Draw conclusions from the results using a linear regression or other model
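
Steps [2] and [3] above can be sketched as follows. This is a minimal illustration, not a finished analysis: the record keys ("created", "user_id", "post_type") are hypothetical, and the share of posts written by the single most active user is only one possible way to measure "concentration of people doing all the work".

```python
from collections import Counter
from datetime import datetime, timedelta

WINDOW_DAYS = 14  # from the brief: only the first 14 days count


def early_metrics(posts, window_days=WINDOW_DAYS):
    """Summarise a forum's first `window_days` days of posts.

    `posts` is a list of dicts with hypothetical keys:
    "created" (datetime), "user_id", "post_type" ("question" or "answer").
    """
    if not posts:
        raise ValueError("forum has no posts")
    start = min(p["created"] for p in posts)  # first post opens the window
    cutoff = start + timedelta(days=window_days)
    window = [p for p in posts if p["created"] < cutoff]

    answers = sum(1 for p in window if p["post_type"] == "answer")
    per_user = Counter(p["user_id"] for p in window)
    return {
        "posts_per_day": len(window) / window_days,
        "answers_per_day": answers / window_days,
        # Concentration: fraction of posts from the most active user.
        "top_user_share": max(per_user.values()) / len(window),
    }


# Made-up posts for one forum; the last one falls outside the window.
day0 = datetime(2015, 1, 1)
posts = [
    {"created": day0, "user_id": 1, "post_type": "question"},
    {"created": day0 + timedelta(days=1), "user_id": 2, "post_type": "answer"},
    {"created": day0 + timedelta(days=2), "user_id": 1, "post_type": "answer"},
    {"created": day0 + timedelta(days=5), "user_id": 1, "post_type": "question"},
    {"created": day0 + timedelta(days=20), "user_id": 3, "post_type": "answer"},
]
metrics = early_metrics(posts)
print(metrics)
```

Running this once per forum would give one row of metrics per forum. Those rows, together with the success/failure label from step [1], could then be the input to the model in step [4]. Since the label is binary, a logistic regression may be a more natural fit than a linear one.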

Skills: Data Science, Python, Statistics

About the Employer:
(5 reviews) Brantford, Australia

Project ID: #6834516

18 freelancers are bidding an average of $325 for this job


Hi I think it's better if you, 1. import the data in a relational database. 2. create an analysis and reporting interface around it (db). 3. an interactive environment to play with data. 4. some beforehand funct...

$700 USD in 8 days
(26 reviews)

A proposal has not yet been provided

$250 USD in 7 days
(36 reviews)

I suppose you have lots of options to select among core programmers, but I am a statistician as well. Also, I have assistants to help me speed up the projects.

$200 USD in 2 days
(19 reviews)

Hello, I'm using Python daily for most of my programming needs. I have experience with databases (PostgreSQL, MongoDB, MySQL) and relevant modules like json, multiprocessing, datetime. I have analyzed large files (...

$240 USD in 3 days
(5 reviews)

Hi, We have a team of Data Mining and Web Scraping experts. We have worked on many Data Mining techniques including Association Rule Mining, Clustering, Outlier Mining, Sentiment Analysis etc extensively in the pas...

$1111 USD in 10 days
(7 reviews)

I am good at python/json process, and also interested in your project. Please contact me to discuss more details for your project, Thanks!

$222 USD in 5 days
(10 reviews)

A proposal has not yet been provided

$250 USD in 7 days
(11 reviews)

Hi Good Day I have great experience both with JSON and Python. Also have experience with working on AWS instances. Willing to work on this project. Looking forward to hear from you. Thanks Rinsad

$263 USD in 5 days
(3 reviews)

A proposal has not yet been provided

$100 USD in 3 days
(8 reviews)

Hi, I am a python developer and have quite some experience. I am willing to complete the work in the required time. Please let me know if you are interested

$55 USD in 3 days
(6 reviews)

A proposal has not yet been provided

$133 USD in 7 days
(5 reviews)

Hi, I can help you with this, lots of experience with python and analyzing data (including using Ipython notebook) Thanks, Craig

$300 USD in 5 days
(1 review)

I have 4+ years of working experience in machine learning domain and have master degree in computer science. Worked on many projects mainly in Predictive analytics, Natural language language, text mining, web mining et...

$666 USD in 21 days
(1 review)

Dear Client, We have gone through given requirement detail and confident to deliver you best solution as we have expert in-house team of Python programmers who deliver best & bug free solution to our clients. We...

$222 USD in 5 days
(1 review)

I have experience in Python and data science. Also I have experience in reading data from JSON format. I am physicist. You can find my publications here [login to view URL] I kn...

$444 USD in 7 days
(0 reviews)

Hi, your project looks interesting. I think it falls into two parts: 1. simple data manipulation, choosing part of data by certain criteria. 2. statistics, using linear regression or other models to analyze the data,...

$155 USD in 5 days
(0 reviews)

Very interesting project, with emphasis on understanding the data. My knowledge of mathematics and experience in data design and manipulation make me a good choice. Please, contact me on if you want to know more ab...

$250 USD in 6 days
(0 reviews)

Hi, I have reviewed the requirements, we are a small team of Python/Django & MEAN developers working on node.js/ angularjs since 2010. here is few of our Python Work : [login to view URL] http://...

$263 USD in 25 days
(0 reviews)

As I understand the project, it will just involve some interpretation of the data in a relational manner. I did study some statistics and relational techniques in school, and I know how to do it in Python. However,...

$222 USD in 10 days
(0 reviews)