In this project you will use Amazon Elastic MapReduce (EMR) to implement the PageRank algorithm on a dataset.

The dataset to be used is the Wikipedia database, and what needs to be done is:

--> Extracting the links database (the first thing you need to do is extract the wikilinks from the XML file).

--> Writing a Pig/Hive program to compute the PageRank.
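The two steps above can be sketched on a single machine before porting the logic to Pig or Hive on EMR. The Python below is only an illustrative sketch, not the required Pig/Hive deliverable. It assumes the page text has already been pulled out of the XML dump into a title-to-wikitext mapping; the names `extract_links` and `pagerank`, the damping factor of 0.85, the fixed 20 iterations, and the rule of skipping namespaced links such as `File:` are all assumptions made for this example.

```python
import re

# Matches [[Target]], [[Target|label]] and [[Target#Section]];
# only the target title is captured.
WIKILINK = re.compile(r"\[\[([^\[\]|#]+)(?:#[^\[\]|]*)?(?:\|[^\[\]]*)?\]\]")


def extract_links(wikitext):
    """Return the distinct link targets found in one page's wikitext."""
    targets = {m.group(1).strip() for m in WIKILINK.finditer(wikitext)}
    # Assumption: skip namespaced links such as File: or Category:.
    return sorted(t for t in targets if t and ":" not in t)


def pagerank(graph, damping=0.85, iterations=20):
    """Iterative PageRank over a dict {page: [outgoing link targets]}."""
    nodes = set(graph)
    for targets in graph.values():
        nodes.update(targets)
    n = len(nodes)
    ranks = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by pages without outgoing links is spread evenly.
        dangling = sum(ranks[node] for node in nodes if not graph.get(node))
        new_ranks = {
            node: (1 - damping) / n + damping * dangling / n for node in nodes
        }
        for node, targets in graph.items():
            if targets:
                share = damping * ranks[node] / len(targets)
                for target in targets:
                    new_ranks[target] += share
        ranks = new_ranks
    return ranks


# Tiny hypothetical input standing in for pages parsed from the XML dump.
pages = {
    "A": "See [[B]] and [[C|the C page]].",
    "B": "Back to [[A#Intro]].",
    "C": "[[File:x.png]] has no article links.",
}
graph = {title: extract_links(text) for title, text in pages.items()}
for title, rank in sorted(pagerank(graph).items()):
    print(title, round(rank, 4))
```

In a Pig or Hive version, each loop iteration would typically become one join-and-group pass over a (page, rank, links) relation, with the script rerun for a fixed number of iterations.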

Alternatively, you may choose your own project topic and dataset and write a project report on it. The dataset you choose should be at least several GB in size (minimum 5 GB), and you can use any Hadoop-related tool (including MapReduce, Pig, Hive, Mahout, etc.) to process your data. You need to describe the problem considered for your project and propose a solution approach using one of the tools/methodologies such as recommender systems, clustering, classification, etc.

The dataset could be the data you are using at work, some data that is involved in your everyday life, or any publicly available dataset.

Skills: Big Data Sales, Hadoop


About the Employer:
(2 reviews) Kent, United States

Project ID: #6789935

Awarded to:


I work as a data scientist and have experience with the MapReduce framework and with tools such as Hive and Pig. I have some queries: 1. Do we need to install Hadoop and Pig/Hive on your machine? 2. Or else do you have AWS EC2 …

$100 USD in 3 days
(0 reviews)

8 freelancers are bidding an average of $248 for this job


I do big data related jobs every day! I graduated from Carnegie Mellon University with a master's degree. I have lots of industry experience in the big data area. I worked at IBM and Twitter before. I know how to use Hadoop 1 …

$350 USD in 3 days
(1 review)

I can deliver the best solution at the lowest possible price. I will use Pig to accomplish the task, which makes it easy to maintain.

$111 USD in 5 days
(0 reviews)

6+ years of experience in machine learning and a master's degree in computer science. Expertise in R, WEKA, Java, Python, Hadoop, MapReduce, Pig, and Hive. Worked on many machine learning projects.

$555 USD in 21 days
(0 reviews)

I have already done many ranking projects using Hive/Pig scripting, and have done some POCs related to extracting data.

$155 USD in 7 days
(0 reviews)

Machine learning scholar. My research areas include data mining, NLP, and image processing. My past experience includes 8+ years in machine learning research. I have expertise in R, Python, Hadoop, MapRe …

$333 USD in 10 days
(0 reviews)

Hi, I have 1+ year of experience in big data, Hadoop, Pig, and Hive. I can do this work for you. Please consider me.

$155 USD in 3 days
(0 reviews)

I have already worked in this domain. I am working on my PhD on graph-based services (incl. PageRank, etc.) and am an expert on recommender systems, so you just need to find your dataset and express your needs.

$222 USD in 7 days
(0 reviews)