Rational subgroup sampling
Hi, I'm trying to select 5 random samples for each subcategory in a database. I know I could extract all the records for each subcategory and make a random selection from each extraction, but there are too many subcategories to do that efficiently. Does anyone know a way to use rational subgroup sampling in IDEA?
If there are duplicates in your file, you should remove them before doing the stratification. You could use Duplicate Key detection to find items that might be duplicated. If you know there are duplicates, check out the newest script in the IDEA Lab that looks for unique records: it returns only the first record for any combination of keys, so all your items would then be unique based on the key.
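For anyone who wants to see the "first record per key combination" idea outside IDEA, here is a minimal Python sketch. It is not the IDEA Lab script and does not use any IDEA API; the function name `keep_first_per_key` and the field names `doc_num`, `supplier` and `amount` are made up for illustration.

```python
def keep_first_per_key(records, key_fields):
    """Return the records, keeping only the first one seen for each key combination."""
    seen = set()
    unique = []
    for rec in records:
        # Build the composite key from the chosen fields (hypothetical field names).
        key = tuple(rec[field] for field in key_fields)
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


# Illustrative data only: the second row repeats the first row's key.
rows = [
    {"doc_num": "1001", "supplier": "Acme", "amount": 250.0},
    {"doc_num": "1001", "supplier": "Acme", "amount": 250.0},
    {"doc_num": "1002", "supplier": "Acme", "amount": 75.5},
]
unique_rows = keep_first_per_key(rows, ["doc_num", "supplier"])
```

The original record order is preserved, so which duplicate counts as "first" depends on how the file is sorted before the step is run.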
Many thanks for the reply. Where can I find this script and the IDEA Lab? Any advice would be much appreciated.
At first, I tried setting criteria such as Doc Num <> Doc Num OR Supplier Name <> Supplier Name to select unique samples during numeric stratification, but that did not work.
Also, is there any way to get notifications for replies in a specific forum thread? Sorry if I'm being naive, and thanks again, Mr. Brian Element!
Hi musfar,
Sorry for not getting back sooner but work has kept me away from the site.
I have updated your profile so you should receive emails when someone responds to a post you have commented on.
To access the IDEA Lab you need version 11.1 of IDEA. In version 11.1 there is a ribbon item called IDEA Lab that contains a link to the site.
You can also remove duplicates using Duplicate Key detection.
Or you could summarize your file; that is another option for getting rid of duplicates.
Hi Brian,
Many thanks for your reply! Really appreciate your time and consideration.
Just wanted to let you know that I didn't receive any notification email for the comments or responses. Also, I'm already applying duplicate detection and summarisation wherever possible. I was wondering whether the script you mentioned in comment #4, which selects the unique records, is available.
Finally, as you suggested, I tried looking for IDEA Lab, but this is what is shown when I try to update to version 11.1. Kindly advise.
Have you tried using Stratified Random Sampling? You could use each subcategory as a separate stratum and then select the number of sample items for each.
Here is a video on how to do this: https://www.youtube.com/watch?v=_oG0-I3kmTA
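To illustrate the logic of sampling a fixed number of records from each stratum, here is a rough Python sketch. It is not IDEAScript and does not reflect how IDEA implements Stratified Random Sampling; the function `sample_per_group`, the field names `id` and `subcategory`, and the seed value are assumptions made for the example.

```python
import random
from collections import defaultdict


def sample_per_group(records, group_field, n=5, seed=None):
    """Pick up to n records at random, without replacement, from each group."""
    rng = random.Random(seed)  # a fixed seed makes the selection repeatable
    groups = defaultdict(list)
    for rec in records:
        groups[rec[group_field]].append(rec)

    samples = {}
    for name, members in groups.items():
        # A group with fewer than n records is returned in full.
        k = min(n, len(members))
        samples[name] = rng.sample(members, k)
    return samples


# Illustrative data only: three subcategories of 10 records each, plus one tiny group.
records = [{"id": i, "subcategory": "ABC"[i % 3]} for i in range(30)]
records.append({"id": 100, "subcategory": "D"})
picked = sample_per_group(records, "subcategory", n=5, seed=42)
```

This makes a single pass over the file to build the groups, so there is no need to extract each subcategory separately. Recording the seed lets the same selection be reproduced later if the sample has to be re-performed.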