Skip to main content

Joining Tables when key is between two values (upper and lower)

Hello, this is my first post on this forum as I am a recent IDEA user.
This question has been crackling my brain recently without any success.
I am trying to calculate the tax retention of multiple employees.
Tax Retention tables have an upper and a lower limit, and 2 features. For the two features I can simply do a join matches only but is there any way to do a key join between a range?
Tax retention table example on the right side.Employee Info on the left.
I want to extract the tax% value for each employee based on the features and its value.
My employee list as about 2700 records and there are 1600 records for the tax retension table.
 
 
 
 
 

Brian Element Fri, 06/30/2017 - 14:03

Hi myrahz, welcome to the site and welcome to IDEA.

That is a good question and thanks for supplying an example that I can use.  I created IDEA files out of your data to walk you through how I would go about it.

The problem here is you want to do a join and one of the fields, namely the Value field, doesn't have an exact match in the Tax Retention file but is based on a range with an upper and lower value.  So the first step is to create a field that you can match on.  What I did was create a virtual numeric field in each file called TAX_RANGE.  In your Tax Retention file I used a compif for the calculation, this is fine for smaller files but if you have a large number of Upper and Lower ranges you probably want to do a summary by Upper and Lower range and then add the key to that new file.

So in this example this is what the field looks like, it will now return 1 if it is between 100 and 200, 2 for 200 and 300 and 3 for 300 and 400.

So the new file looks like:

I then do something similar to the Employee file except instead of equals I use >= or <

So anything between 100 and 200 gets a 1 and 200 and 300 gets 2.  The file now looks like this:

I then do the join based on the Class, Level and Tax Range.

And for the Tax Retention file I only select the Tax_perc field.

Now the final file looks like this:

Hopefully this will help you out.  The biggest thing will be creating the Tax Range field, I used a comp_if statement but if the file is large (which is my guess) you are probably best off doing a summary and then creating the Tax Range based off of that summary.

Brian

myrahz Mon, 07/03/2017 - 05:12

In reply to by Brian Element

Thank you for your help Brian Element, though I should have been more specific on my case. I had already thought about that approach but there are 1200 different ranges, as they are not shared among the different classifications and levels.
Each person has 6 different classifications(based of being married, if the parter works or not, and if any of the two have any disability), and 6 different level combinations(number or childs between 0 and 5+).
Instead of the dummy file I will give you the real table, as it is public on the internet, but I worked on it a bit to make it easier for IDEA.
I'll try scotchy33's script, thanks scotchy33, and see what I can do.
I've thought of an approach, in theory, but I foundit to be a bit time consuming so I thought there would be an easier way, hence I asked.
Since there are 6 classifications, 6 levels and then around ~33.3 different ranges for each, I will create 36 different tables (combination of classification and level). And then it is easier to do a compif.
Thanks once again for all your help, Brian Element and scotchy33.
 
 
 

scotchy33 Fri, 06/30/2017 - 15:22

Here is a script that I wrote that should do what you want.  You will have to do classes and levels seperately.  Extract out in both files class A level 1, class A level 2 (key value extraction), etc and run the script in relation to class A level 1, class A level 2 ect.  Dummy files are included to test.

myrahz Mon, 07/03/2017 - 10:47

In reply to by scotchy33

I had some troubles and after 2 hours I was able to figure out that all the fields need to be same lenght , I had some numeric with 4 and others with 5. Had to convert all numeric to char of same lenght and it worked flawlessly after that. Thanks once again for the script