Python – How to get first value in groupby, TypeError: first() missing 1 required positional argument: 'offset'
dataframe
group-by
pandas
python
I'm trying to get the most frequently occurring Counts value for each state.
My code:
df.groupby('States')['Counts'].value_counts().first()
gives:
TypeError: first() missing 1 required positional argument: 'offset'
Expected output:
States Counts
AK one
LO three
Use a lambda function:
df = df.groupby('States')['Counts'].apply(lambda x: x.value_counts().index[0]).reset_index(name='val')
print (df)
States val
0 AK one
1 LO three
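For context, value_counts() returns a Series sorted by frequency in descending order, so the first index label within each group is that group's most frequent value. In the pandas versions that raise this error, first() on a Series is a time-series method that expects an offset, which is why calling it with no argument fails. The following is a minimal, self-contained sketch of the approach above; the sample data is made up for illustration (the question does not show its input) and is chosen so that no group has a tie.

```python
import pandas as pd

# Made-up sample data (an assumption, not taken from the question).
# Each state has one clearly most frequent value, so ties do not arise.
df = pd.DataFrame({
    "States": ["AK", "AK", "AK", "LO", "LO", "LO"],
    "Counts": ["one", "one", "two", "three", "three", "one"],
})

# value_counts() sorts by frequency (descending), so index[0] is the
# most frequent value within each group.
most_common = (
    df.groupby("States")["Counts"]
      .apply(lambda x: x.value_counts().index[0])
      .reset_index(name="val")
)
print(most_common)
```

If two values are equally frequent within a group, which one comes first is not something to rely on; handle ties explicitly if they matter.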
The function ord() gets the int value of the char. And in case you want to convert back after playing with the number, function chr() does the trick.
>>> ord('a')
97
>>> chr(97)
'a'
>>> chr(ord('a') + 3)
'd'
In Python 2, there was also the unichr function, returning the Unicode character whose ordinal is its argument:
>>> unichr(97)
u'a'
>>> unichr(1234)
u'\u04d2'
In Python 3 you can use chr instead of unichr.
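As a small illustration of the round trip between characters and integers in Python 3, here is a sketch with a hypothetical helper, shift_char (the name is made up for this example), that moves a character by a given number of code points:

```python
def shift_char(ch: str, offset: int) -> str:
    """Return the character `offset` code points after `ch`.

    Hypothetical helper for illustration: ord() maps the character to its
    integer code point and chr() maps the shifted integer back.
    """
    return chr(ord(ch) + offset)


shifted = shift_char("a", 3)   # same as chr(ord('a') + 3)
cyrillic = chr(1234)           # Python 3 chr() covers what unichr() did in Python 2
print(shifted, cyrillic)
```

There is no wrap-around here: shifting past the letter range simply gives whatever character follows in Unicode.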
ord() - Python 3.6.5rc1 documentation
ord() - Python 2.7.14 documentation
;WITH cte AS
(
    SELECT *,
        ROW_NUMBER() OVER (PARTITION BY DocumentID ORDER BY DateCreated DESC) AS rn
    FROM DocumentStatusLogs
)
SELECT *
FROM cte
WHERE rn = 1
If you expect 2 entries per day, then this will arbitrarily pick one. To get both entries for a day, use DENSE_RANK instead.
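To see the difference between the two window functions, the sketch below runs the same query shape against an in-memory SQLite database from Python. This is an illustration only: it assumes a SQLite build with window-function support (3.25 or later), the ID and Status columns and all rows are invented for the example, and the original query targets a different database engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# ID and Status are assumed columns; only DocumentID and DateCreated
# appear in the original query.
conn.execute(
    "CREATE TABLE DocumentStatusLogs "
    "(ID INTEGER, DocumentID INTEGER, Status TEXT, DateCreated TEXT)"
)
# Made-up rows: document 1 has two entries on its latest day.
rows = [
    (1, 1, "S1", "2011-07-29"),
    (2, 1, "S2", "2011-07-30"),
    (3, 1, "S3", "2011-07-30"),
    (4, 2, "S1", "2011-08-01"),
]
conn.executemany("INSERT INTO DocumentStatusLogs VALUES (?, ?, ?, ?)", rows)

QUERY = """
WITH cte AS (
    SELECT *,
        {fn}() OVER (PARTITION BY DocumentID ORDER BY DateCreated DESC) AS rn
    FROM DocumentStatusLogs
)
SELECT ID, DocumentID, Status, DateCreated
FROM cte
WHERE rn = 1
ORDER BY ID
"""


def latest(fn):
    """Run the query with the given ranking function name."""
    return conn.execute(QUERY.format(fn=fn)).fetchall()


latest_one = latest("ROW_NUMBER")  # exactly one row per document
latest_all = latest("DENSE_RANK")  # every row tied on the latest date
print(len(latest_one), len(latest_all))
```

With ROW_NUMBER, which of the tied rows for document 1 is returned is not defined by the query; adding a tie-breaking column to the ORDER BY would make the choice deterministic.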
As for normalised or not, it depends if you want to:
maintain status in 2 places
preserve status history
As it stands, you preserve status history. If you want the latest status in the parent table too (which is denormalisation), you'd need a trigger to maintain "status" in the parent, or drop this status history table.