Rational subgroup sampling

alex.darger
Joined: 03/04/2019 - 15:46

Hi, I'm trying to select 5 random samples for each subcategory in a database. I know I could extract all the records for each subcategory and make a random selection from each extraction, but there are too many subcategories to do that efficiently. Does anyone know a way to do rational subgroup sampling in IDEA?

Brian Element
Joined: 07/11/2012 - 19:57

Have you tried using Stratified Random Sampling? You could use each subcategory as a separate stratum and then select the number of sample items for each.

Here is a video on how to do this: https://www.youtube.com/watch?v=_oG0-I3kmTA
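
For anyone who wants to see the idea outside IDEA, here is a minimal Python sketch of sampling a fixed number of records per stratum. It is not IDEAScript and does not reproduce the IDEA dialog; the function name `sample_per_stratum` and the field names `subcategory` and `doc_num` are made up for illustration.

```python
import random
from collections import defaultdict


def sample_per_stratum(records, key, n, seed=None):
    """Return up to n randomly chosen records from each stratum.

    records: list of dicts; key: the field that defines the stratum.
    Both the function and the field names are hypothetical.
    """
    rng = random.Random(seed)

    # Group the records by the value of the stratum field.
    strata = defaultdict(list)
    for rec in records:
        strata[rec[key]].append(rec)

    # Draw without replacement; a stratum smaller than n is taken whole.
    picked = {}
    for name, items in strata.items():
        picked[name] = rng.sample(items, min(n, len(items)))
    return picked


# Made-up data: 40 records spread over 4 subcategories.
records = [{"subcategory": f"S{i % 4}", "doc_num": i} for i in range(40)]
samples = sample_per_stratum(records, "subcategory", 5, seed=1)
```

Because the strata are built from the data itself, nothing has to be typed in per subcategory, however many there are.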

 

musfar
Joined: 07/06/2020 - 10:07

Hi
Is there a way to set criteria in order to avoid duplicate field values when selecting stratified random samples, i.e. a duplicate amount, a duplicate document number, etc.?

Brian Element
Joined: 07/11/2012 - 19:57

If there are duplicates in your file, you should remove them before doing the stratification. You could use duplicate key detection to look for items that might have duplicates. If you know there are duplicates, you can check out the newest script in the IDEA Lab that looks for unique records: it returns only the first record for any combination of keys, so all your items would then be unique based on the key.
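
The "first record per combination of keys" behaviour can be sketched in Python as follows. This is only an illustration of the logic, not the IDEA Lab script itself (which is not shown in this thread); the function name `first_unique` and the field names are hypothetical.

```python
def first_unique(records, keys):
    """Keep only the first record seen for each combination of key fields."""
    seen = set()
    unique = []
    for rec in records:
        combo = tuple(rec[k] for k in keys)
        if combo not in seen:
            seen.add(combo)
            unique.append(rec)
    return unique


# Made-up data: the first two rows share the same document number and supplier.
rows = [
    {"doc_num": "D1", "supplier": "Acme", "amount": 100},
    {"doc_num": "D1", "supplier": "Acme", "amount": 100},
    {"doc_num": "D2", "supplier": "Acme", "amount": 100},
]
deduped = first_unique(rows, ["doc_num", "supplier"])
```

Which fields go into `keys` decides what counts as a duplicate, so keying on the amount alone would drop records that merely happen to share a value.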

musfar
Joined: 07/06/2020 - 10:07

Many thanks for the reply. I was wondering where this script and the IDEA Lab can be found; your advice would be greatly appreciated.
As a first attempt, I tried setting criteria like Doc Num <> Doc Num OR Supplier Name <> Supplier Name to select unique samples while doing numeric stratification, but it is not working.
Also, is there any way to get notifications for a specific forum thread and its replies? Sorry if I'm being naive. Thanks again, Mr. Brian Element!
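
A likely reason the criterion above returns nothing, assuming the criterion is evaluated one record at a time the way row filters generally are: each record is only ever compared with itself, so "Doc Num <> Doc Num" can never be true. A small Python sketch with made-up field names shows the effect.

```python
# Made-up data containing an obvious duplicate.
rows = [
    {"doc_num": "D1", "supplier": "Acme"},
    {"doc_num": "D1", "supplier": "Acme"},
    {"doc_num": "D2", "supplier": "Bolt"},
]

# A row-wise criterion sees one record at a time, so both comparisons
# test a value against itself and are always false.
matches = [
    r for r in rows
    if r["doc_num"] != r["doc_num"] or r["supplier"] != r["supplier"]
]
```

Detecting duplicates needs a comparison across records, which is why a separate de-duplication step before sampling is the usual route.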
 

musfar
Joined: 07/06/2020 - 10:07

@Brian
Hi Brian,
I would really appreciate it if you could have a look at this.

Brian Element
Joined: 07/11/2012 - 19:57

Hi musfar,

Sorry for not getting back sooner but work has kept me away from the site.

I have updated your profile so you should receive emails when someone responds to a post you have commented on.

To access the IDEA Lab you need version 11.1 of IDEA. In version 11.1 you will have a ribbon item called IDEA Lab that contains a link to access the site.

You can also remove duplicates using duplicate key detection.

Or you could summarize your file; that could be another option to help get rid of duplicates.
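
To show why summarizing removes duplicates, here is a rough Python sketch: grouping on the key fields leaves exactly one row per key combination, along with a count of how many records it had. It is not IDEA's Summarization task; the function name `summarize` and the `no_of_recs` column are assumptions made for the example.

```python
from collections import Counter


def summarize(records, keys):
    """Return one row per key combination with a record count."""
    counts = Counter(tuple(rec[k] for k in keys) for rec in records)
    # Rebuild each key combination as a dict and attach its count.
    return [dict(zip(keys, combo), no_of_recs=n) for combo, n in counts.items()]


# Made-up data: the first two rows are duplicates on both key fields.
rows = [
    {"doc_num": "D1", "supplier": "Acme", "amount": 100},
    {"doc_num": "D1", "supplier": "Acme", "amount": 100},
    {"doc_num": "D2", "supplier": "Acme", "amount": 50},
]
summary = summarize(rows, ["doc_num", "supplier"])
```

Any row with a count above 1 marks a key combination that appeared more than once in the original file.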

musfar
Joined: 07/06/2020 - 10:07

Hi Brian,

Many thanks for your reply! Really appreciate your time and consideration.

Just wanted to let you know that I didn't receive any notification email for the comments or responses received. Also, I'm already applying duplicate detection and summarisation wherever possible. I'm just wondering whether there is a script, as you mentioned in your previous comment #4, that will select the unique records.

Finally, as you mentioned, I tried checking for IDEA Lab, but this is how it's showing while trying to update to version 11.1. Kindly advise.

Images: 
Brian Element
Joined: 07/11/2012 - 19:57
Hi Musfar,

To get access to the IDEALab you need to upgrade to IDEA 11. Unfortunately I can't help you with that; you will need to contact your IDEA distributor for the upgrade.

Thanks,
Brian
musfar
Joined: 07/06/2020 - 10:07

Ok, got it! Thanks for the reply Brian

alex.darger
Joined: 03/04/2019 - 15:46

Ok, I think that should work. Is there a way to auto-populate the limits with all the subcategories? There are 78 that I need to enter manually, and the dialog will only let me enter one at a time.
