I defined the student_sub_set DataFrame as below:
# select the subset of characteristics for the regression
# and drop every row that has a missing value in any of them
student_sub_set = student[['acad_lang_home', 'absent_freq', 'tired_freq', 'sex',
                           'bullying', 'like_math', 'clear_math',
                           'disorder_math', 'confident_math', 'value_math',
                           'like_science', 'clear_science', 'confident_science',
                           'value_science', 'study_support',
                           'parent_edu_max', 'internet_access',
                           'desired_edu',
                           'parent_immig_1', 'mmat_avg', 'ssci_avg']].dropna()
When I run student_sub_set.info(), I get this output:
Int64Index: 2565 entries, 1 to 4573
Data columns (total 21 columns):
 #   Column             Non-Null Count  Dtype
---  ------             --------------  -----
 0   acad_lang_home     2565 non-null   category
 1   absent_freq        2565 non-null   category
 2   tired_freq         2565 non-null   category
 3   sex                2565 non-null   object
 4   bullying           2565 non-null   category
 5   like_math          2565 non-null   category
 6   clear_math         2565 non-null   category
 7   disorder_math      2565 non-null   category
 8   confident_math     2565 non-null   category
 9   value_math         2565 non-null   category
 10  like_science       2565 non-null   category
 11  clear_science      2565 non-null   category
 12  confident_science  2565 non-null   category
 13  value_science      2565 non-null   category
 14  study_support      2565 non-null   category
 15  parent_edu_max     2565 non-null   category
 16  internet_access    2565 non-null   float64
 17  desired_edu        2565 non-null   category
 18  parent_immig_1     2565 non-null   float64
 19  mmat_avg           2565 non-null   float64
 20  ssci_avg           2565 non-null   float64
dtypes: category(16), float64(4), object(1)
memory usage: 162.9+ KB
Then I defined X_stud as below:
X_stud = student_sub_set[['acad_lang_home', 'absent_freq', 'tired_freq', 'sex',
                          'bullying', 'like_math', 'clear_math',
                          'disorder_math', 'confident_math', 'value_math',
                          'like_science', 'clear_science', 'confident_science',
                          'value_science', 'study_support',
                          'parent_edu_max', 'internet_access',
                          'desired_edu', 'parent_immig_1']]
<class 'pandas.core.frame.DataFrame'>
Int64Index: 2565 entries, 1 to 4573
Data columns (total 45 columns):
 #   Column                                                Non-Null Count  Dtype
---  ------                                                --------------  -----
 0   internet_access                                       2565 non-null   float64
 1   parent_immig_1                                        2565 non-null   float64
 2   acad_lang_home_Sometimes                              2565 non-null   uint8
 3   acad_lang_home_Almost always                          2565 non-null   uint8
 4   acad_lang_home_Always                                 2565 non-null   uint8
 5   absent_freq_Once every two month                      2565 non-null   uint8
 6   absent_freq_Once a month                              2565 non-null   uint8
 7   absent_freq_Once every two weeks                      2565 non-null   uint8
 8   absent_freq_Once a week                               2565 non-null   uint8
 9   tired_freq_Sometimes                                  2565 non-null   uint8
 10  tired_freq_Almost every day                           2565 non-null   uint8
 11  tired_freq_Every day                                  2565 non-null   uint8
 12  sex_Male                                              2565 non-null   uint8
 13  bullying_About Monthly                                2565 non-null   uint8
 14  bullying_About Weekly                                 2565 non-null   uint8
 15  like_math_Somewhat Like Learning Mathematics          2565 non-null   uint8
 16  like_math_Very Much Like Learning Mathematics         2565 non-null   uint8
 17  clear_math_Moderate Clarity of Instruction            2565 non-null   uint8
 18  clear_math_High Clarity of Instruction                2565 non-null   uint8
 19  disorder_math_Some Lessons                            2565 non-null   uint8
 20  disorder_math_Most Lessons                            2565 non-null   uint8
 21  confident_math_Somewhat Confident in Mathematics      2565 non-null   uint8
 22  confident_math_Very Confident in Mathematics          2565 non-null   uint8
 23  value_math_Somewhat Value Mathematics                 2565 non-null   uint8
 24  value_math_Strongly Value Mathematics                 2565 non-null   uint8
 25  like_science_Somewhat Like Learning Science           2565 non-null   uint8
 26  like_science_Very Much Like Learning Science          2565 non-null   uint8
 27  clear_science_Moderate Clarity of Instruction         2565 non-null   uint8
 28  clear_science_High Clarity of Instruction             2565 non-null   uint8
 29  confident_science_Somewhat Confident in Science       2565 non-null   uint8
 30  confident_science_Very Confident in Science           2565 non-null   uint8
 31  value_science_Somewhat Value Science                  2565 non-null   uint8
 32  value_science_Strongly Value Science                  2565 non-null   uint8
 33  study_support_Either Own Room or Internet Connection  2565 non-null   uint8
 34  study_support_Both Own Room and Internet Connection   2565 non-null   uint8
 35  parent_edu_max_Lower Secondary                        2565 non-null   uint8
 36  parent_edu_max_Upper Secondary                        2565 non-null   uint8
 37  parent_edu_max_Post-secondary but not University      2565 non-null   uint8
 38  parent_edu_max_University or Higher                   2565 non-null   uint8
 39  desired_edu_ISCED Level 2                             2565 non-null   uint8
 40  desired_edu_ISCED Level 3                             2565 non-null   uint8
 41  desired_edu_ISCED Level 4                             2565 non-null   uint8
 42  desired_edu_ISCED Level 5                             2565 non-null   uint8
 43  desired_edu_ISCED Level 6                             2565 non-null   uint8
 44  desired_edu_ISCED Level 7                             2565 non-null   uint8
dtypes: float64(2), uint8(43)
memory usage: 167.8 KB
What is the difference between them? I cannot figure out why the column dtypes of these two DataFrames are not the same. I have looked at this code a lot, but I cannot find the difference. Can anyone tell me the cause of this difference? The second output above is what I get from:
X_stud.info()
in your second output, since none of the columns you give from student_sub_set is apparent. – Bumbling
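One possible explanation, offered as a hedged sketch rather than a diagnosis: columns named like sex_Male and tired_freq_Sometimes with dtype uint8 look like the result of one-hot encoding, for example pd.get_dummies with drop_first=True. That step does not appear in the code shown, so it is an assumption here. The toy DataFrame below is hypothetical (made-up values, only three columns) and just contrasts plain column selection with dummy encoding.

```python
import pandas as pd

# Hypothetical miniature of student_sub_set: one category column, one object
# column, and one float column. The values are invented for illustration.
toy = pd.DataFrame({
    "tired_freq": pd.Categorical(
        ["Never", "Sometimes", "Every day", "Sometimes"],
        categories=["Never", "Sometimes", "Every day"],
    ),
    "sex": ["Female", "Male", "Male", "Female"],
    "internet_access": [1.0, 0.0, 1.0, 1.0],
})

# Plain column selection keeps the original columns and their dtypes.
selected = toy[["tired_freq", "sex", "internet_access"]]
print(selected.dtypes)

# One-hot encoding replaces each category/object column with indicator
# columns named <column>_<level>; numeric columns are kept and placed first.
# drop_first=True drops the first level of each encoded column.
# dtype="uint8" is passed explicitly because newer pandas defaults to bool.
encoded = pd.get_dummies(selected, drop_first=True, dtype="uint8")
print(encoded.dtypes)
```

If X_stud were only the column selection shown in the question, X_stud.info() would list the same 19 columns with their original category, object, and float64 dtypes, so it may be worth checking whether X_stud was reassigned somewhere between the two calls.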