How to store "smaller than", equals and "larger than" in database
Asked Answered
H

4

6

I need to store scientific information in a database (sql server). What is the best way to store a value in a database where "smaller than", larger than" is part of the information.

Example:

PatientId: 123 Mutation rate: <3%

PatientId: 999 Mutation rate: 3%

PatientId: 456 Mutation rate: 10%

I need to be able to sort and filter the data. A mutation rate of <3% is better than 3%

How can I sove this?

Thank you for your help

Hecklau answered 30/11, 2009 at 22:30 Comment(4)
Is a mutation rate of "4.691%" better than "<4.7%"?Rising
I would store what you use to calculate the mutation rate, in order to accommodate changes in ranking.Tuck
the <4.7 % is used since the accuracy of the measurement is not high at the borders "it is not 0 but to low to pin a number on it" in the sample you have given you would measure 4.691 but save <4.7 % if this is the lowest value measureable, but 0 is also a value.Hecklau
Yes, but how would you sort it? Mark's question was about your sorting method: is 4.691 better than "<4.7"? Are you saying that you'd convert 4.691 into <4.7? Meaning that 4.691 would be the same as <4.7?Aged
U
9

How about adding a 3rd column for a clarifying int?

0 = Less than
1 = Less than or equal to
2 = Equal to
3 = Greater than or equal to
4 = Greater than

Unrestrained answered 30/11, 2009 at 22:35 Comment(3)
If you were to do this but reorder it to 0=Less than, 1=Less than or equal to, 2=Equal to, 3=Greater than or equal to, and 4=Greater than, you could do an order by on this with the other column to get the desired results.Suburb
In the math optimization world, it is custom to note that A > B <=> -B < -A, so you could even keep only 3 values in the columns and cover all cases: <, <=, =.Tankard
or you could just use the actual < > <=, =, >= and store them as text (varcahr) as long as you properly escape these values. Create far less confusion than trying to remember what 3 equals to and what 4 equels to. AVOID creating new conventions if older ones will suffice.Avis
F
1

The easiest, is typically to use predefined values for these cases, for example here, a value of 3.0 means 3% whereby 2.99 means "less than 3%".

Since these "Less than" and "More than" values typically only apply at the ends of the range, such a convention allows handling all filtering and ordering with a single field value,in a standard fashion. The main drawback to this approach is that it implies hard-coding these limit values at the level of the application, for display purposes and such.

The alternative is to introduce a two columns value, with a numeric value and a "qualifier" column which contains a code indicating "exact value" or "Less than" or "more than". While this approach appears to be more generic (and it is, at the level of the display etc.), some hard-coding is often required at the level of the "input".

Fingernail answered 30/11, 2009 at 22:35 Comment(2)
But shouldn't such limits be in the application? The database is for storing data. The business logic applies rules to the data and the UI shows the data to the user in a meaningful format.Mealie
@Corin, Yes, such limits and generally implicit info about the scale etc. should be handled at the application level.Fingernail
A
0

Unless you want to break up data into 3 fields like less_equal_more(varchar) (<.=, > or text equivalents) , measurement (deciaml) (3.5 or whatever is your level of precision) and units_of_measurment(varchar) (like percentage, or absolute in mile or meters - whateever) - you should store in varchar. But I would breakup your data - will make your search quesries easier. Dealing with full-text search ( which you will have to do if you store everything in one field) is pretty difficult compared to queries you would have to do if you break up your data.

Avis answered 30/11, 2009 at 22:40 Comment(0)
G
0

I'd suggest storing your data as intervals, ie <3% would become [0,3), 3% would become [3,3], and so on. This might take 4 columns in your database, one for each of the endpoints and one to denote whether the interval is open or closed at each end point. Keep the numeric data numeric so you can do arithmetic on it, encode the values 'open' and 'closed' in the way that best suits your access and manipulation requirements.

Guck answered 1/12, 2009 at 8:12 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.