AskHYS.net Q x Q – Online Data Query System Instructions

CONTENTS

What is Q x Q
What Survey Data Can You Use?
Getting Started with Your Analysis
Running Crosstabs with HYS Data
Testing for Significant Differences
Resources
Frequently Asked Questions
More Examples

What is Q x Q?

QXQ is a query system on AskHYS.net, a website for the Washington State Healthy Youth Survey. Using the QXQ you can find answers to questions from the Healthy Youth Survey, question by question---Q by Q, Q X Q. The “X” or “by” also means that you can run crosstabs---that is, create a table with, for instance, smoking in the rows “by” drinking in the columns. You’ll see what we mean….

What Can You Do with the Q x Q?

The QXQ allows you to run frequencies and crosstabs with Healthy Youth Survey data.

What Survey Data Can You Use?

Anyone can use the state sample, and most county data, for every year (2002, 2004, 2006, and 2008) and every grade in which the survey was implemented (6th, 8th, 10th, and 12th grades).

TWO IMPORTANT NOTES:
  1. You can run an analysis on only one year and one grade at a time.
  2. State, county, and ESD results are available to everyone. But access to school district and school building data requires that you log on with a password:

Logging On

Does my school/school district have data?

What Survey Questions are Available

The questions available on the Q x Q depend on the year of the survey, and the grade level.

Getting Started with Your Analysis

STEP 1: Choose Initial Analysis Variables

Using the drop down boxes at the top of the Q x Q, select year, grade, gender and geography (or location):

Image of the dropdown boxes

STEP 2: Select Survey Questions

First: Click the drop-down menu “Select a survey data category…”

Image of the dropdown boxes

Next: Select a category, and then a sub-category

Image of the dropdown box

The survey questions that relate to alcohol access will appear in the drop down menu below.

NOTE: The numbers in parentheses on the left correspond to the dataset variable names. Those numbers can help you navigate the Survey Question Crosswalk. (See the FAQ about finding survey questions.)

Image of the select box

When you click on a question title, in this case, Access to Alcohol, the “Item response category preview” box to the right will have the survey question and its response options. This is a good way to verify that you are looking at the question you really want.

Image of the select box

TIP: You can also “right click” on a question title to bring up a pop-up box. Click on “See all years’ descriptions” and it will provide the question, response options and the years it was asked.

STEP 3: Drag and Drop Questions into Analysis Boxes

Image of the select box

Dragging and Dropping: Click and hold on the question you want to analyze, and drag it into the analysis box labeled “Drop first variable (row) here….”

OR:

Image of the select box

You can also right click on the question and you’ll get a pop up with the option to “Include in analysis”. Then select “row” or “column”.

STEP 4: Choose Response Options

The box to the right will show you the response options for the question. “Surveyed” will give you all of the options that were available to the respondent. “Collapsed” reduces the response options into only two responses.

TIP: Why would you use the “collapsed” categories? By combining the answers of two or more responses, you also combine the number of respondents in the “collapsed” categories. This will be important when you run crosstabs because you MUST have a minimum number of respondents in each cell of your results table. [There is more about “cells” in the next section on crosstabs.]
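To illustrate why collapsing helps with cell size, here is a minimal Python sketch. The counts and the grouping into “Hard” and “Easy” are made up for illustration; they are not HYS results, and the Q x Q may group the response options differently.

```python
# Hypothetical counts for a four-option question at a small school
# (illustrative numbers only, not HYS data).
surveyed = {
    "Very hard": 14,
    "Sort of hard": 9,
    "Sort of easy": 8,
    "Very easy": 6,
}

# Assumed grouping of the four surveyed options into two collapsed ones.
grouping = {
    "Very hard": "Hard",
    "Sort of hard": "Hard",
    "Sort of easy": "Easy",
    "Very easy": "Easy",
}


def collapse(counts, grouping):
    """Add up respondent counts for options that share a collapsed label."""
    collapsed = {}
    for option, n in counts.items():
        label = grouping[option]
        collapsed[label] = collapsed.get(label, 0) + n
    return collapsed


collapsed = collapse(surveyed, grouping)
print(collapsed)                  # {'Hard': 23, 'Easy': 14}
print(min(surveyed.values()))     # 6
print(min(collapsed.values()))    # 14
```

With the surveyed options, the smallest group in this made-up example has 6 respondents; after collapsing, the smallest group has 14.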

For [M10] Access to Alcohol:

Collapsed: Image of the control box
Surveyed: Image of the control box

STEP 5: Run Query

Click the “run query” button at the bottom right side of the screen.

Image of the submit button

EXAMPLE: In this example, we selected: year 2008, grade 8, location State WA, and the question [M10] Access to Alcohol. For the response options we chose “surveyed”.

Image of a report

The results include the percentage of students who chose this option, the 95% confidence interval and the number of respondents for each response option. For more information about confidence intervals and how to use them, see the Frequently Asked Questions section of this document.

Changing to a New Analysis: Drag, Click, or Reset

To change to a different question:

To use the same question but for a different grade:

EXCEPTION: If you want to run 6th grade, you will have to re-select your question.

Running Crosstabs with HYS Data

What is a Crosstab?

A crosstab allows you to see the relationship between two different questions from HYS. In the example below, we see the relationship between drinking and smoking: 95.4% of youth who reported NO drinking on any day(s) in the past month did not smoke (cell 1), but 35.5% who drank also report smoking (cell 4).

Image of a crosstab report

Requirements for Crosstabs

NOTE: The squares that have results in them are the “cells”, and the number of respondents in each cell is called the “n” or the “cell size”.

Cell Size

If you don’t have enough respondents per cell, you will receive the following error message:

“At least one cell in the results table contained a count of less than 10. Output is suppressed.”

Cell size limitations can be frustrating. In the example above, we were using the state sample, so the n’s are large. If you were running a crosstab for a small district or school the number of responses in each cell would be much smaller, so you may not get a report. If your results are suppressed because of cell size, try this:

Survey Form: A, B, or C

The secondary school version of the survey has two distinct versions: Form A and Form B. In each classroom, half the students get Form A, and half get Form B. Most of the survey questions are either on Form A or Form B. For a crosstab, both questions have to come from the same form---you can’t cross a question that is only on Form A with another question that is only on Form B.

For example, you cannot cross current cigar smoking (only on form B) with current methamphetamine use (only on form A). You will receive the following error message:

“No surveys contained responses to all the selected variables”.

About 35 questions are identical on both versions. For those questions, like 30-day cigarette use or 30-day alcohol use, you can run crosstabs with any question on either survey. This is important because Form A has most of the risk and protective factors, and Form B has most of the physical and mental health questions.
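The same-form rule can be sketched in Python as a set intersection. This is only an illustration: the dictionary keys are hypothetical labels rather than Q x Q variable names, and only the three questions mentioned above are included.

```python
# Which survey form(s) carry each question. The three entries come from the
# examples in the text; the keys are hypothetical labels, not Q x Q
# variable names.
FORMS = {
    "current cigar smoking": {"B"},
    "current methamphetamine use": {"A"},
    "30-day cigarette use": {"A", "B"},
}


def can_cross(question_1, question_2, forms=FORMS):
    """Return True if at least one form contains both questions."""
    return bool(forms[question_1] & forms[question_2])


print(can_cross("current cigar smoking", "current methamphetamine use"))  # False
print(can_cross("30-day cigarette use", "current methamphetamine use"))   # True
print(can_cross("30-day cigarette use", "current cigar smoking"))         # True
```

A question that appears on both forms shares a form with every other question, which is why it can be crossed with anything.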

Running a Crosstab: STEPS 1-3

To run a crosstab, you use the menus the same way that you would for a frequency, but now you choose two questions instead of one.

  1. STEP 1: Drag or insert your first question in the “Drop first variable (row) here” analysis box.
  2. STEP 2: Drag or insert your second question in the “Drop second variable (column) here” analysis box.
  3. STEP 3: Select response options for both questions.

In this example, we have current alcohol drinking as the first variable, and current cigarette smoking as the second variable. For both questions we have selected the collapsed response options.

Image of an example crosstab selection

Running a Crosstab: STEP 4

Hit the Run Query button.

Image of an example crosstab report

Interpreting Crosstab Results

Read your results by the row, not by the column. Notice that the “Total” for each row is 100%. So this means that the top row has results for all of the students (100%) who reported no alcohol drinking.

The bottom row has results for all of the students who reported at least one day of drinking.

You can’t read the columns in the same way!

If you want to know how many smokers drink, re-run your crosstab. Select:

TIP: Another way to say this: the report we ran is about student non-drinkers and student drinkers, and it asks whether they smoke. To find out whether smokers and non-smokers drink, re-run the crosstab with smoking in the rows and drinking in the columns.

TIP: The Table Preview in the upper right hand corner can give you an idea about how your analysis will turn out. (The table uses variable number rather than the whole item name.)

Image of table preview

In this example, [D20] Current Alcohol Drinking will appear in the rows, and [D14] Current Cigarette Smoking will be in the columns. So your results will tell you how many drinkers smoke cigarettes.
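To see why rows and columns cannot be read the same way, here is a short Python sketch that computes row percentages for a table and then for the same table with rows and columns swapped. The counts are made up for illustration, and the calculation is a simple unweighted percentage, which may not match how the Q x Q computes its estimates.

```python
def row_percentages(table):
    """Convert each row of counts to percentages that sum to 100."""
    result = {}
    for row_label, cells in table.items():
        row_total = sum(cells.values())
        result[row_label] = {
            col_label: round(100 * n / row_total, 1)
            for col_label, n in cells.items()
        }
    return result


def swap_rows_and_columns(table):
    """Put the column variable in the rows and the row variable in the columns."""
    swapped = {}
    for row_label, cells in table.items():
        for col_label, n in cells.items():
            swapped.setdefault(col_label, {})[row_label] = n
    return swapped


# Made-up counts: drinking in the rows, smoking in the columns.
counts = {
    "No drinking": {"No smoking": 900, "Smoking": 100},
    "Drinking": {"No smoking": 300, "Smoking": 200},
}

by_drinking = row_percentages(counts)
by_smoking = row_percentages(swap_rows_and_columns(counts))

print(by_drinking["Drinking"]["Smoking"])   # 40.0  (share of drinkers who smoke)
print(by_smoking["Smoking"]["Drinking"])    # 66.7  (share of smokers who drink)
```

The same 200 students sit in both cells, but the percentage changes because the row total changes: 500 drinkers in the first table, 300 smokers in the second.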


Testing for Significant Differences

When an analysis you have done is interesting or important for your work, you may want to discover if the differences you have found between two groups are “statistically significant”. This is important when you are using data from a sample, and want to generalize to a larger population.

If the difference between two groups (say, the difference in smoking rates between non-drinkers and drinkers in the example above) is “statistically significant” it means that we feel confident that the difference is not due to chance alone. That confidence is expressed as a probability. A commonly used probability, 95%, can be interpreted as “we are 95% confident that the difference is significant”, or that chance could explain the difference only 5% of the time.

You can assess significance using confidence intervals, but it is more precise to determine significance mathematically, and you can do it easily. Go to http://www.hys.wa.gov/Reporting/Default.aspx and scroll down to the 3rd bullet under Information and Tools, “Excel Tool for Determining Statistical Significance”. All you have to do is enter your data---the percentage, and the plus-or-minus (the margin of error, ±) into the boxes.

EXAMPLE:
To test for differences in drinking soda between 10th and 12th graders in 2008, statewide:
Year: 2008 Grade: 10th Gender: Both Location: State
First Variable: Nutrition – Junk Food - [H09] Soda Drinking (collapsed)
Results = 2 or more sodas for 10th grade: 15.3% (±1.7)
Year: 2008 Grade: 12th Gender: Both Location: State
First Variable: Nutrition – Junk Food - [H09] Soda Drinking (collapsed)
Results = 2 or more sodas for 12th grade: 14.7% (±1.9)
Image of a statistical significance report

Interpretation: P-value is greater than 0.05, so there is no statistically significant difference in drinking 2 or more sodas between 10th and 12th graders, in 2008 statewide.

To test for differences in drinking soda between 10th grade males and females in 2008, statewide:
Year: 2008 Grade: 10th Gender: Male Location: State
First Variable: Nutrition – Junk Food - [H09] Soda Drinking (collapsed)
Results = 2 or more sodas for 10th grade males: 20.5% (±2.3)
Year: 2008 Grade: 10th Gender: Female Location: State
First Variable: Nutrition – Junk Food - [H09] Soda Drinking (collapsed)
Results = 2 or more sodas for 10th grade females: 10.5% (±1.9)
Image of a statistical significance report

Interpretation: P-value is less than 0.05, so 10th grade males are more likely to drink 2 or more sodas compared to 10th grade females, in 2008 statewide.
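For readers who want to see the kind of calculation involved, here is a minimal Python sketch of one common approach: a two-sided z-test built from each percentage and its margin of error. This document does not describe the formula inside the Excel tool, so this is an assumption and its p-values may differ from the tool’s. It also assumes the two groups are independent and that each ± value is a 95% margin of error. The function name is hypothetical.

```python
import math

Z_95 = 1.96  # multiplier for a 95% confidence interval (normal approximation)


def p_value_from_margins(pct_1, moe_1, pct_2, moe_2):
    """Two-sided p-value for the difference between two percentages.

    Assumes the two groups are independent and that each margin of error
    is a 95% margin, so standard error = margin / 1.96.
    """
    se_1 = moe_1 / Z_95
    se_2 = moe_2 / Z_95
    z = (pct_1 - pct_2) / math.sqrt(se_1 ** 2 + se_2 ** 2)
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))


# 10th vs. 12th grade, 2 or more sodas, 2008 statewide
p_grades = p_value_from_margins(15.3, 1.7, 14.7, 1.9)
# 10th grade males vs. 10th grade females
p_gender = p_value_from_margins(20.5, 2.3, 10.5, 1.9)

print(p_grades > 0.05)   # True
print(p_gender < 0.05)   # True
```

Under these assumptions the sketch reaches the same conclusions as the two interpretations above: no significant difference between grades, and a significant difference between males and females.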

If you want to learn more about statistical significance, including how to interpret the confidence intervals in your QXQ results, go to the Frequently Asked Questions in this handout, or to page 36 in the HYS training workbook, at http://www.hys.wa.gov/Workshops/2008/HYS09WorkbookFinal.pdf.

Resources

https://fortress.wa.gov/doh/hys

http://www.hys.wa.gov http://www.AskHYS.net

For help with the AskHYS.net Q x Q, contact: Susan Richardson: sue.richardson@doh.wa.gov

Frequently Asked Questions

Did my School Participate?

The results are displayed in the following order: state, ESD, county, district, school building and then consortia*. School district and building results are arranged by county, so the districts in Adams County are first and those in Yakima County are last.

Select the year of results you want and scroll or search to find your ESD, county, district or school:

Why do I Have to Start Over Every Time I Want to Change my Analysis to the Sixth Grade?

Each time you choose a different survey year or select 6th grade rather than a secondary school grade, the available questions will change.

How do I Find the Survey Question I Want?

There are a lot of questions on HYS and many categories and sub-topics to choose from. If you are having difficulty finding the question you want, try using the Survey Question Crosswalk. You can find it at this website: https://fortress.wa.gov/doh/hys/SurveyQuestions.htm. We will soon add that to the AskHYS website.

The Master tab of the crosswalk includes the actual HYS question, the sub-topic, and the title. You can search for a specific word to find your question and where to find it in the drop downs.

The columns on the right of the spreadsheet include the variable names for 2008, 2006, 2004, and 2002. These columns not only tell you the variable names, but let you know if the question was asked that year. If a question title is missing for a specific year that means the question was not asked then.

There are also year tabs: 2008, 2006, 2004, and 2002:

TIP: If you have the question you want, but simply want to know if it is available in other years, in the QXQ dropdown list you can “right click” on a question title to bring up a pop-up box. Click on “See all years’ descriptions” and it will provide the question, response options and the years it was asked.

Knowing which form the question was on can be helpful in determining:

I can’t find the question and data for “overweight”.

The results for “overweight” are not available at the school building level. (A WSIRB requirement because it is a physically identifiable characteristic.)

What are “Confidence Intervals” and How should I Use Them?

AskHYS results include a ± number after each item estimate—this number is a confidence interval. A confidence interval accounts for the fact that the reported value is probably a little different than the true value for all of the students. A 95% confidence interval, for example, means that we are 95% confident that the true value is within the ± range. Confidence intervals are important when you generalize results to a larger population.

Why do we need confidence intervals if data are valid?

Confidence intervals account for variability among students, NOT the validity of the data.

  • Variability is inherent in any population worth studying. If variability were not a factor, administering a survey to answer questions would not be necessary.
  • Variability causes uncertainty in the results. Confidence intervals allow for the comparison of results to others and to ourselves over time.

EXAMPLE: Smoked cigarettes (Grade 10, 2008): 14.4% (± 1.6%). This means that the estimate of the 10th grade smoking rate is 14.4%, plus or minus 1.6%. If you do the math, that means 14.4 – 1.6 = 12.8%, and 14.4 + 1.6 = 16.0%. So, we are 95% confident that the true value is between 12.8% and 16.0%.

Why are confidence intervals different sizes?

The size of a confidence interval is affected by:

  • Number of students. In general, the more students surveyed, the smaller the confidence interval.
  • Inherent variability. If most students answer a survey question in the same way, then there is less variability. The more variable the answers, the wider the confidence intervals.
  • Level of confidence. HYS uses 95% confidence intervals. This percentage is commonly used, but results can be calculated for different percentages. If 80% confidence were desired, the confidence interval would be smaller. If 99% confidence were desired, the confidence interval would be larger.
  • Sampling design.
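The effect of the level of confidence can be sketched in Python. This is an approximation that assumes the margin of error scales with the normal-distribution multiplier (about 1.96 for 95%); it is not a description of how HYS results are calculated, and the function names are hypothetical.

```python
from statistics import NormalDist


def z_multiplier(confidence):
    """Two-sided normal multiplier, e.g. about 1.96 for 0.95."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)


def rescale_margin(margin, from_confidence, to_confidence):
    """Approximate the margin of error at a different confidence level."""
    return margin * z_multiplier(to_confidence) / z_multiplier(from_confidence)


margin_95 = 1.6  # from the Grade 10 smoking example: 14.4% (± 1.6%)
margin_80 = rescale_margin(margin_95, 0.95, 0.80)
margin_99 = rescale_margin(margin_95, 0.95, 0.99)

print(round(margin_80, 1))   # 1.0
print(round(margin_99, 1))   # 2.1
```

Under this approximation, the ± 1.6 margin from the smoking example would shrink to about ± 1.0 at 80% confidence and grow to about ± 2.1 at 99% confidence.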

What is “Statistical Significance”, and When Does It Matter?

Statistical significance means that differences in results are unlikely to be due to chance alone. When using 95% confidence intervals, a difference between two groups is considered statistically significant if chance could explain it only 5% of the time or less.

Confidence intervals can help you quickly determine significant differences, but there are more precise ways to determine significance. The Excel tool described under Testing for Significant Differences in this handout can help with this. Also, assistance determining statistical significance is available from many sources including the local health department, the local ESD, JSPC agencies, or the Internet.

Sample of a significant difference between state and local data:
  • Smoked cigarettes in the state: 14.4% (± 1.6%)
    Interpret as between 12.8% and 16.0%.
  • Smoked cigarettes at my school: 20.0% (± 2.0%)
    Interpret as between 18.0% and 22.0%.

Conclusion: The highest value for the state (16.0%) and the lowest value for the school (18.0%) do not overlap; thus the difference IS statistically significant.

Sample of a nonsignificant difference between state and local data:
  • Smoked cigarettes in the state: 14.4% (± 1.6%)
    Interpret as between 12.8% and 16.0%.
  • Smoked cigarettes at my school: 20.0% (±10.0%)
    Interpret as between 10.0% and 30.0%.

Conclusion: At least one confidence interval (10.0 to 30.0%) overlaps the other point estimate (14.4%), thus the difference is NOT statistically significant.

Sample of an inconclusive difference between state and local data:
  • Smoked cigarettes in the state: 14.4% (± 1.6%)
    Interpret as between 12.8% and 16.0%.
  • Smoked cigarettes at my school: 20.0% (± 5.0%)
    Interpret as between 15.0% and 25.0%.

Conclusion: Inconclusive, more testing required (confidence intervals overlap each other but not the point estimates).
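The three samples above follow a simple set of rules, which can be written out as a short Python sketch. The function names are hypothetical, and the sketch only restates the rules of thumb from the samples; it does not replace a formal significance test.

```python
def interval(pct, margin):
    """Return the (low, high) bounds of a percentage and its ± margin."""
    return (pct - margin, pct + margin)


def compare(pct_1, margin_1, pct_2, margin_2):
    """Classify a difference using the three rules in the samples above."""
    low_1, high_1 = interval(pct_1, margin_1)
    low_2, high_2 = interval(pct_2, margin_2)

    # Rule 1: the intervals do not overlap at all.
    if high_1 < low_2 or high_2 < low_1:
        return "significant"
    # Rule 2: at least one interval contains the other point estimate.
    if low_1 <= pct_2 <= high_1 or low_2 <= pct_1 <= high_2:
        return "not significant"
    # Rule 3: the intervals overlap, but neither contains the other estimate.
    return "inconclusive"


print(compare(14.4, 1.6, 20.0, 2.0))    # significant
print(compare(14.4, 1.6, 20.0, 10.0))   # not significant
print(compare(14.4, 1.6, 20.0, 5.0))    # inconclusive
```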

Can I Compare my QXQ Results to Other Reports?

You can compare the Q x Q results with the 2008 Statewide Grade 8 Report of Results (from RMC), available online at: https://fortress.wa.gov/doh/hys under Reports and Response Rates:

152. If you wanted to get some beer, wine, or hard liquor (for example, vodka, whiskey, or gin), how easy would it be for you to get some?

                     State (n = 4,310)
  a. Very hard       36.2% (± 1.9%)
  b. Sort of hard    26.9% (± 1.4%)
  c. Sort of easy    20.8% (± 1.4%)
  d. Very easy       16.1% (± 1.4%)

Notice that there are some slight differences between the Q x Q results and the Report of Results. For Very Hard, the Q x Q result is 36.1% ±2.0 and in the Report of Results it is 36.2% ±1.9. These discrepancies can result from the way in which different statistical programs round numbers. If the discrepancy is small, it is not a problem.

To make sure the differences are just due to rounding, check the number of respondents for each response option. Notice that the Report of Results n is 4,310. If you add up the respondents from the Q x Q analysis (1,558 + 1,160 + 897 + 695) you also get an n of 4,310.
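The check described above is a simple sum. As a small Python sketch, using the respondent counts quoted in this section:

```python
# Respondent counts per response option from the Q x Q analysis in the text.
qxq_counts = [1558, 1160, 897, 695]
report_n = 4310  # n shown in the Report of Results

total = sum(qxq_counts)
print(total)               # 4310
print(total == report_n)   # True
```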

Why Can’t I Get Results because of the “Cell size”?

The cell-size rule is designed to protect the anonymity of students taking the HYS, and is required by the Washington State Institutional Review Board (WSIRB).
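As an illustration of how a rule like this works, here is a minimal Python sketch. The threshold of 10 comes from the Q x Q error message quoted earlier; the function names and the two tables are made up, and this document does not describe how the Q x Q itself implements the check.

```python
MIN_CELL_COUNT = 10  # threshold quoted in the Q x Q error message


def smallest_cell(table):
    """Return the smallest respondent count in a table of rows of counts."""
    return min(n for row in table for n in row)


def is_suppressed(table, minimum=MIN_CELL_COUNT):
    """True if any cell has fewer than `minimum` respondents."""
    return smallest_cell(table) < minimum


# Made-up 2 x 2 tables (rows by columns), not HYS data.
large_sample = [[3200, 150], [700, 380]]
small_school = [[41, 4], [12, 7]]

print(is_suppressed(large_sample))   # False
print(is_suppressed(small_school))   # True
```

One small cell is enough to suppress the whole table, which is why small schools and districts run into the rule more often than the state sample does.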

More Examples

Frequencies by Year:

In 2008, what was the rate of current alcohol use among 8th graders across the state?
Year: 2008 Grade: 8th Gender: Both Location: State
First Variable: Alcohol – Current Use – [D20] Current Alcohol Drinking (collapsed)
Report results
Interpretation:
  • 16.1% (±1.6) of 8th graders were current alcohol drinkers in 2008 across the state.
In 2006, what was the rate of current alcohol use among 8th graders across the state?
Year: 2006 Grade: 8th Gender: Both Location: State
First Variable: Alcohol – Current Use – [D20] Current Alcohol Drinking (collapsed)
Report results
Interpretation:
  • Notice that the drop down menu resets because different questions were asked in 2006.
  • 15.4% (±1.9) of 8th graders were current alcohol drinkers in 2006 across the state.

Frequencies by Grade:

In 2006, what was the rate of current alcohol use among 6th graders across the state?
Year: 2006 Grade: 6th Gender: Both Location: State
First Variable: Alcohol – Current Use – [D20] Current Alcohol Drinking (collapsed)
Report results
Interpretation:
  • Notice that the drop down menu resets because 6th graders were asked different questions than 8th, 10th and 12th graders.
  • 4.3% (±0.7) of 6th graders were current alcohol drinkers in 2006 across the state.

Crosstabs by Gender:

In 2008, what was the rate of current alcohol use among 8th grade girls across the state?
Year: 2008 Grade: 8th Gender: Female Location: State
First Variable: Alcohol – Current Use – [D20] Current Alcohol Drinking (collapsed)
Report results
Interpretation:
  • Notice that this doesn’t look like a crosstab (doesn’t have 4 cells), but it is because you crossed a question by gender – so the rules for crosstabs apply.
  • 16.4% (±2.0) of 8th grade girls were current alcohol drinkers in 2008 across the state.

Crosstabs by Race/Ethnicity:

In 2008, what were current alcohol use rates among 8th graders by race/ethnicity across the state?
Year: 2008 Grade: 8th Gender: Both Location: State
First Variable: Demographics – Race/Ethnicity – [G06] Race/Ethnicity (collapsed)
Second Variable: Alcohol – Current Use – [D20] Current Alcohol Drinking (collapsed)
Report results
Interpretation:
  • With variable G06, respondents who only selected one race/ethnicity response option are included in that Race/Ethnicity. Any respondent that selected more than one response option is counted as “multiracial”.
  • Notice that with G06 collapsed, Asians and Pacific Islanders are combined, and the multiracial and other race respondents are combined.
  • 22.3% (±4.0) of Hispanic/Latino 8th graders were current alcohol drinkers in 2008 across the state.
In 2008, what was the rate of current alcohol use among Hispanic/Latino 8th graders across the state?
Year: 2008 Grade: 8th Gender: Both Location: State
First Variable: Demographics – Race/Ethnicity – [hispanic] Hispanic or Latino/Latina, Only
Second Variable: Alcohol – Current Use – [D20] Current Alcohol Drinking (collapsed)
Report results
Interpretation:
  • Notice that by using the Race Only variable you replicate the result from [G06]. Hispanic or Latino/Latina, Only includes respondents that only selected the Hispanic or Latino/Latina response option.
  • 22.3% (±4.0) of Hispanic/Latino 8th graders were current alcohol drinkers in 2008 across the state.
In 2008, what was the rate of current alcohol use among Hispanic/Latino 8th graders across the state?
Year: 2008 Grade: 8th Gender: Both Location: State
First Variable: Demographics – Race/Ethnicity – [G06d] Hispanic or Latino/Latina, Any
Second Variable: Alcohol – Current Use – [D20] Current Alcohol Drinking (collapsed)
Report results
Interpretation:
  • Notice that by using the Race Any variable you get a different result from [G06]. Hispanic or Latino/Latina, Any includes respondents that selected Hispanic or Latino/Latina as at least one of their response options. So respondents who selected two or more response options, for example, Hispanic or Latino/Latina and American Indian/Alaska Native, are included.
  • Notice that the Total n for “all or part Hispanic” (982) is bigger than for “only Hispanic” (811).
  • 22.8% (±3.2) of Hispanic/Latino 8th graders were current alcohol drinkers in 2008 across the state.

Additional Crosstabs

Example bullying and depression:

In 2008, were 10th graders who reported depression more likely to be bullied?
Year: 2008 Grade: 10th Gender: Both Location: State
First Variable: Mental Health – Depression – [H53] Depression in the last 12 months
Second Variable: School Climate and Safety – Bullying and Harassment – [C01] Bullying (collapsed)
Report results
Interpretation:
  • Among 10th graders who were depressed, 32.7% (±2.4) were bullied.
  • Among 10th graders who were not depressed, 18.6% (±1.3) were bullied.
  • Using confidence intervals, you can tell that 10th graders who were depressed were more likely to be bullied compared to those who were not depressed, statewide in 2008.

Example eating breakfast and academic achievement (grades)

In 2008, were 10th graders who ate breakfast more likely to get better grades in school across the state?
Year: 2008 Grade: 10th Gender: Both Location: State
First Variable: Nutrition – Breakfast – [H84] Breakfast Eating
Second Variable: School Success – Academic Achievement – [S17] Grades Last Year (collapsed)
Report results
Interpretation:
  • Among 10th graders who ate breakfast, 75.4% (±2.8) got mostly A’s and B’s in school.
  • Among 10th graders who did not eat breakfast, 59.0% (±3.7) got mostly A’s and B’s.
  • Using confidence intervals, you can tell that 10th graders who ate breakfast were more likely to get better grades in school compared to those who didn’t eat breakfast, statewide in 2008.