Question: Spark: how to count names by the first character of the name from a Spark registered table


I am trying to run the piece of Spark code below in PySpark and I am getting an error. Could you please help me understand what is missing?

p1 = pd.DataFrame(final_data,columns = ['Year','Name','Sex','Count'])      
h1 = sqlContext.createDataFrame(p1)         
h1.registerTempTable('namesdb')             
sqlContext.sql("select SUBSTR(Name, 1, 1) as char1, count(Name) FROM namesdb group by char1 order by char1 ASC").toPandas()    

But I am getting the error below:

AnalysisException: u"cannot resolve 'char1' given input columns: [Year, Name, Sex, Count];

Here are sample records from final_data:

final_data[:2]        

[[1880, 'Mary', 'F', '7065'],
 [1880, 'Anna', 'F', '2604']]
Question author Ytasfeb15

Answer


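In standard SQL, and in the Spark version that raised this error, you cannot reference a column alias from the SELECT list (here char1) in the GROUP BY clause, because GROUP BY is evaluated before the aliases are assigned. Repeat the expression in the GROUP BY clause instead; the alias can still be used in ORDER BY: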

select SUBSTR(Name, 1, 1) as char1, count(Name) FROM namesdb group by SUBSTR(Name, 1, 1) order by char1 ASC
Answer author Camaris
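
To check the corrected query without a Spark session, here is a minimal sketch that runs it against an in-memory SQLite table. This is only a stand-in and rests on some assumptions: SQLite is not Spark SQL (it accepts the query above, but it will not reproduce the AnalysisException), the table schema is guessed from the sample records, and the 'Alice' row is hypothetical, added only so that one initial has more than one name.

```python
import sqlite3

# Sample rows from the question, plus one hypothetical extra row so that
# the initial 'A' has more than one name.
final_data = [
    [1880, 'Mary', 'F', '7065'],
    [1880, 'Anna', 'F', '2604'],
    [1880, 'Alice', 'F', '1414'],  # hypothetical row, not from the question
]

# Corrected query: the SUBSTR expression is repeated in GROUP BY,
# and the alias char1 is used only in ORDER BY.
query = (
    "select SUBSTR(Name, 1, 1) as char1, count(Name) "
    "FROM namesdb group by SUBSTR(Name, 1, 1) order by char1 ASC"
)

# In-memory SQLite table standing in for the Spark temp table 'namesdb'.
# Column names are quoted because "Count" is also a function name.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE namesdb ("Year" INTEGER, "Name" TEXT, "Sex" TEXT, "Count" TEXT)'
)
conn.executemany("INSERT INTO namesdb VALUES (?, ?, ?, ?)", final_data)

rows = conn.execute(query).fetchall()
conn.close()

print(rows)  # [('A', 2), ('M', 1)]
```

In the setup from the question, the same query string would be passed to sqlContext.sql(query).toPandas() after registering the temp table.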

