Page-wise Aggregation Document

class OpenLA.data_classes.pagewise_aggregation.PageWiseAggregation(df)[source]

Bases: object

property df

num_users()[source]

Get the number of users in the Dataframe

Returns

The number of users in the Dataframe

Return type

int

user_id()[source]

Get the unique user ids in the Dataframe

Returns

One-dimensional array of user ids in the Dataframe

Return type

List[str]

contents_id()[source]

Get the unique contents ids in the Dataframe

Returns

One-dimensional array of contents ids in the Dataframe

Return type

List[str]
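As a rough illustration of what num_users(), user_id(), and contents_id() report, the sketch below derives the same kinds of values from a plain list of rows. The row layout and the field names (userid, contentsid, pageno) are assumptions made for this example; they are not taken from OpenLA, and the helper is not OpenLA's implementation.

```python
# Hypothetical page-wise rows. The field names are assumptions for
# illustration only, not OpenLA's actual column names.
rows = [
    {"userid": "U1", "contentsid": "C1", "pageno": 1},
    {"userid": "U1", "contentsid": "C1", "pageno": 2},
    {"userid": "U2", "contentsid": "C1", "pageno": 1},
    {"userid": "U2", "contentsid": "C2", "pageno": 1},
]


def unique_values(rows, key):
    """Return the distinct values of `key`, in first-seen order."""
    return list(dict.fromkeys(row[key] for row in rows))


user_ids = unique_values(rows, "userid")          # analogous to user_id()
contents_ids = unique_values(rows, "contentsid")  # analogous to contents_id()
num_users = len(user_ids)                         # analogous to num_users()
```

The ordering of the returned ids in OpenLA itself is not stated in this document; first-seen order is simply a choice made for the sketch.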

operation_name()[source]

Get the unique operations in the Dataframe

Returns

One-dimensional array of operation names in the Dataframe

Return type

List[str]

operation_count(operation_name=None, user_id=None, contents_id=None)[source]

Get the count of each operation in the Dataframe

Parameters
  • user_id (str or None) – The user whose operations are counted. If it is None, the total count over all users is returned.

  • contents_id (str or None) – The contents in which operations are counted. If it is None, the total count over all contents is returned.

  • operation_name (str or None) – The name of the operation to count. If it is None, every operation is counted.

Returns

If “operation_name” is None, a dictionary of the count of each operation in the Dataframe (key: operation name, value: count of that operation).

If “operation_name” is given, the count of that operation.

Return type

dict or int
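The "dict or int" return convention described for operation_count can be sketched in plain Python as follows. This is a minimal stand-in under stated assumptions: the row layout, the field names, and the operation names in the sample data are hypothetical, and the function only mimics the documented behavior rather than reproducing OpenLA's code.

```python
from collections import Counter

# Hypothetical rows: one per (user, contents) pair with per-operation counts.
# Field names and operation names are assumptions for illustration only.
rows = [
    {"userid": "U1", "contentsid": "C1", "counts": {"NEXT": 3, "PREV": 1}},
    {"userid": "U1", "contentsid": "C2", "counts": {"NEXT": 2}},
    {"userid": "U2", "contentsid": "C1", "counts": {"NEXT": 4, "ADD MARKER": 1}},
]


def operation_count(rows, operation_name=None, user_id=None, contents_id=None):
    """Mimic the documented contract: dict of all counts, or one int."""
    total = Counter()
    for row in rows:
        # None means "do not filter on this field".
        if user_id is not None and row["userid"] != user_id:
            continue
        if contents_id is not None and row["contentsid"] != contents_id:
            continue
        total.update(row["counts"])
    if operation_name is None:
        return dict(total)
    # Counter returns 0 for an operation that never occurred.
    return total[operation_name]


all_counts = operation_count(rows)
next_by_u1 = operation_count(rows, "NEXT", user_id="U1")
```

Returning 0 for an operation that does not occur is a design choice of this sketch; how OpenLA handles an unknown operation name is not stated in this document.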

reading_seconds(user_id=None, contents_id=None)[source]

Get the total reading seconds.

If “user_id” is given, the reading seconds are calculated for those users only. Otherwise, they are calculated for all users in the Dataframe.

If “contents_id” is given, the reading seconds are calculated for those contents only. Otherwise, they are calculated for all contents in the Dataframe.

Parameters
  • user_id (str or List[str]) – User(s) whose reading seconds are aggregated

  • contents_id (str or List[str]) – Content(s) whose reading seconds are aggregated

Returns

The total reading seconds.

Return type

int
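Because user_id and contents_id each accept a single id, a list of ids, or None, the filtering described for reading_seconds can be sketched like this. The rows, the field names, and the helper as_set are hypothetical and exist only for the example; the code shows one way to meet the documented contract, not how OpenLA does it.

```python
# Hypothetical rows; the field names are assumptions for illustration only.
rows = [
    {"userid": "U1", "contentsid": "C1", "reading_seconds": 30},
    {"userid": "U1", "contentsid": "C2", "reading_seconds": 45},
    {"userid": "U2", "contentsid": "C1", "reading_seconds": 60},
    {"userid": "U3", "contentsid": "C1", "reading_seconds": 15},
]


def as_set(value):
    """Normalize None / a single id / a list of ids to None or a set."""
    if value is None:
        return None
    if isinstance(value, str):
        return {value}
    return set(value)


def reading_seconds(rows, user_id=None, contents_id=None):
    """Sum reading seconds over the rows that match both filters."""
    users = as_set(user_id)
    contents = as_set(contents_id)
    return sum(
        row["reading_seconds"]
        for row in rows
        if (users is None or row["userid"] in users)
        and (contents is None or row["contentsid"] in contents)
    )


total = reading_seconds(rows)
u1_total = reading_seconds(rows, user_id="U1")
c1_for_two_users = reading_seconds(rows, user_id=["U1", "U2"], contents_id="C1")
```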

reading_time(time_unit='seconds', user_id=None, contents_id=None)[source]

Get the total reading time. The time unit can be ‘seconds’, ‘minutes’, or ‘hours’.

If “user_id” is given, the reading time is calculated for those users only. Otherwise, it is calculated for all users in the Dataframe.

If “contents_id” is given, the reading time is calculated for those contents only. Otherwise, it is calculated for all contents in the Dataframe.

Parameters
  • time_unit (str) – Time unit of the returned reading time: ‘seconds’, ‘minutes’, or ‘hours’

  • user_id (str or List[str]) – User(s) whose reading time is aggregated

  • contents_id (str or List[str]) – Content(s) whose reading time is aggregated

Returns

The total reading time.

Return type

int
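The relationship between reading_seconds and reading_time is presumably a unit conversion. A minimal sketch of that conversion is shown below; the names SECONDS_PER_UNIT and convert_reading_time are hypothetical. The sketch returns a float, whereas the documented return type is int; whether OpenLA rounds or truncates is not stated in this document.

```python
# Seconds contained in each supported time unit.
SECONDS_PER_UNIT = {"seconds": 1, "minutes": 60, "hours": 3600}


def convert_reading_time(total_seconds, time_unit="seconds"):
    """Convert a total in seconds to the requested unit (as a float)."""
    if time_unit not in SECONDS_PER_UNIT:
        raise ValueError(
            "time_unit must be 'seconds', 'minutes', or 'hours', got %r" % time_unit
        )
    return total_seconds / SECONDS_PER_UNIT[time_unit]


in_minutes = convert_reading_time(5400, "minutes")
in_hours = convert_reading_time(5400, "hours")
```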

num_unique_pages(user_id=None, contents_id=None)[source]

Get the number of unique pages

Returns

The number of unique pages

Return type

int

unique_pages(user_id=None, contents_id=None)[source]

Get the number of unique pages

Returns

The number of unique pages

Return type

int

to_csv(save_file)[source]

class OpenLA.data_classes.pagewise_aggregation.PageTransition(df)[source]

Bases: OpenLA.data_classes.pagewise_aggregation.PageWiseAggregation

num_transition(user_id=None, contents_id=None)[source]

Get the number of page transitions.

Returns

The number of page transitions. In other words, the number of pages read, counting repeated visits to the same page.

Return type

int
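The difference between num_transition (pages read, counting repeats) and num_unique_pages (distinct pages) can be seen on a small hypothetical page sequence. The list below is made-up data, and the two expressions are only a plain-Python analogy for what the methods are documented to return.

```python
# Hypothetical sequence of page numbers one user moved through, in order.
visited_pages = [1, 2, 3, 2, 3, 4]

# Counting every visit, including repeats (analogous to num_transition()).
num_transition = len(visited_pages)

# Counting each page once (analogous to num_unique_pages()).
num_unique_pages = len(set(visited_pages))
```

Here pages 2 and 3 are each visited twice, so the transition count exceeds the unique page count by two.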

operation_name()[source]

Get the unique operations in the Dataframe

Returns

One-dimensional array of operation names in the Dataframe

Return type

List[str]

operation_count(operation_name=None, user_id=None, contents_id=None)[source]

Get the count of each operation in the Dataframe

Parameters
  • user_id (str or None) – The user whose operations are counted. If it is None, the total count over all users is returned.

  • contents_id (str or None) – The contents in which operations are counted. If it is None, the total count over all contents is returned.

  • operation_name (str or None) – The name of the operation to count. If it is None, every operation is counted.

Returns

If “operation_name” is None, a dictionary of the count of each operation in the Dataframe (key: operation name, value: count of that operation).

If “operation_name” is given, the count of that operation.

Return type

dict or int