Tutorial of PageWiseAggregation

The PageWiseAggregation class holds the aggregated log as a member variable and provides method functions to retrieve information from it.
[Figure: page-wise aggregation overview (pagewise_aggregation.png)]
import OpenLA as la

# Operations to count in the page-wise aggregation
operations = ["NEXT", "PREV", "ADD MARKER"]

course_info, event_stream = la.start_analysis(files_dir="dataset_sample", course_id="A")

pagewise_aggregation = la.convert_into_page_wise(event_stream,
                                                 invalid_seconds=10,
                                                 timeout_seconds=30*60,
                                                 user_id=event_stream.user_id()[:10],
                                                 contents_id=event_stream.contents_id()[0],
                                                 operation_name=operations,
                                                 count_operation=True
                                                 )
The method functions in the table below are now available.
You can retrieve information from the log, for example with pagewise_aggregation.num_users().
The full documentation is in the Page-wise Aggregation Document.

function          description
----------------  ---------------------------------------------------------
num_users         Get the number of users in the log
user_id           Get the unique user ids in the log
contents_id       Get the unique contents ids in the log
operation_name    Get the unique operation names in the log
operation_count   Get the count of each (or a specified) operation in the log
reading_seconds   Get the reading seconds in the log
reading_time      Get the reading time (seconds, minutes, or hours) in the log
num_unique_pages  Get the number of unique pages in the log
unique_pages      Get the unique page numbers in the log

If you need processing beyond the functions above, you can get the aggregated log as a DataFrame via pagewise_aggregation.df and process it with the pandas library.

import OpenLA as la
import pandas as pd

operations = ["NEXT", "PREV", "ADD MARKER"]

course_info, event_stream = la.start_analysis(files_dir="dataset_sample", course_id="A")

pagewise_aggregation = la.convert_into_page_wise(event_stream,
                                                 invalid_seconds=10,
                                                 timeout_seconds=30*60,
                                                 user_id=event_stream.user_id()[:10],
                                                 contents_id=event_stream.contents_id()[0],
                                                 operation_name=operations,
                                                 count_operation=True
                                                 )

pagewise_df = pagewise_aggregation.df
print(pagewise_df)
"""
    userid contentsid  pageno  reading_seconds  NEXT  PREV  ADD MARKER
0       U1         C1       1             1233     7     0          0
1       U1         C1       2               80     2     0          2
2       U1         C1       3              374     1     0          0
3       U1         C1       4               46     1     0          0
...     ...       ...       ...            ...    ...   ...        ...
"""

Example