Gene regulation

class spaTrack.Trainer(data_type: Literal['2_time', 'p_time'], expression_matrix_path: Union[str, List[str]], tfs_path: str, cell_mapping_path: Optional[str] = None, ptime_path: Optional[str] = None, min_cells: Optional[Union[int, List[int]]] = None, cell_divide_per_time: int = 80, cell_select_per_time: int = 10, cell_generate_per_time: int = 500, train_ratio: float = 0.8, use_gpu: bool = True, random_state: int = 0)

Class for implementing the training process.

Parameters:
  • data_type – The data type: '2_time' for dual time point data from two slices, or 'p_time' for pseudotime data from one slice.

  • expression_matrix_path – The path of the expression matrix file.

  • tfs_path – The path of the transcription factor (TF) names file.

  • cell_mapping_path – The path of the cell mapping file, where column slice1 indicates the start cell and column slice2 indicates the end cell.

  • ptime_path – The path of the ptime file, used to determine the sequence of the ptime data.

  • min_cells – The minimum number of cells for gene filtration.

  • cell_divide_per_time – The cell number generated at each time point using the meta-analysis method, by default 80.

  • cell_select_per_time – The number of randomly selected cells at each time point.

  • cell_generate_per_time – The number of cells generated at each time point.

  • train_ratio – The fraction of the data used for training.

  • use_gpu – Whether to use the GPU, by default True.

  • random_state – Random seed for NumPy and PyTorch.
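
The constructor arguments above can be collected and checked before a Trainer is built. The following is a minimal sketch, not part of spaTrack: the helper names trainer_kwargs and make_trainer and all file names are hypothetical, and the rule that '2_time' data needs cell_mapping_path while 'p_time' data needs ptime_path is an assumption read from the parameter descriptions.

```python
from typing import Any, Dict, List, Optional, Union


def trainer_kwargs(
    data_type: str,
    expression_matrix_path: Union[str, List[str]],
    tfs_path: str,
    cell_mapping_path: Optional[str] = None,
    ptime_path: Optional[str] = None,
    **overrides: Any,
) -> Dict[str, Any]:
    """Hypothetical helper: collect Trainer arguments and check them."""
    if data_type not in ("2_time", "p_time"):
        raise ValueError(f"data_type must be '2_time' or 'p_time', got {data_type!r}")
    # Assumption: each mode needs its own auxiliary file.
    if data_type == "2_time" and cell_mapping_path is None:
        raise ValueError("'2_time' data is assumed to need cell_mapping_path")
    if data_type == "p_time" and ptime_path is None:
        raise ValueError("'p_time' data is assumed to need ptime_path")
    kwargs: Dict[str, Any] = {
        "data_type": data_type,
        "expression_matrix_path": expression_matrix_path,
        "tfs_path": tfs_path,
        "cell_mapping_path": cell_mapping_path,
        "ptime_path": ptime_path,
    }
    # Optional settings such as use_gpu or random_state pass through unchanged.
    kwargs.update(overrides)
    return kwargs


def make_trainer(kwargs: Dict[str, Any]) -> Any:
    """Build the Trainer; spaTrack is imported lazily so the helper above runs without it."""
    import spaTrack

    return spaTrack.Trainer(**kwargs)


# Hypothetical file names, for illustration only.
kwargs = trainer_kwargs(
    "p_time", "expression.csv", "tfs.txt", ptime_path="ptime.csv", use_gpu=False
)
```

Calling make_trainer(kwargs) would then construct the Trainer, provided spaTrack is installed and the files exist.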

plot_scatter(num_rows: int = 3, num_cols: int = 3, fig_width: int = 10, fig_height: int = 9.5) -> None

Show the relationship between TF and gene changes as scatter plots.

Parameters:
  • num_rows – The number of rows in the graph. (Default: 3)

  • num_cols – The number of columns in the graph. (Default: 3)

  • fig_width – The width of the image. (Default: 10)

  • fig_height – The height of the image. (Default: 9.5)

run(training_times: int = 10, iter_times: int = 30, mapping_num: int = 3000, filename: str = 'weights.csv', lr_ratio: float = 0.1) -> None

Run the trainer.

Parameters:
  • training_times – Number of times to randomly initialize the model and retrain. (Default: 10)

  • iter_times – The number of iterations for each training run. (Default: 30)

  • mapping_num – The number of top-weight pairs to extract. (Default: 3000)

  • filename – The saved file name. (Default: 'weights.csv')
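
A typical session might call run and then plot_scatter on the same Trainer. The sketch below passes the documented default values explicitly; the call order is an assumption, and train_and_plot and _RecordingTrainer are hypothetical names. The recording stand-in replaces a real spaTrack.Trainer so the example runs without the library or any data files.

```python
from typing import Any, Dict, List, Tuple


def train_and_plot(trainer: Any, weights_file: str = "weights.csv") -> None:
    """Hypothetical helper: train, then plot, using the documented defaults."""
    trainer.run(
        training_times=10,  # random restarts of the model
        iter_times=30,      # iterations per training run
        mapping_num=3000,   # number of top-weight pairs to extract
        filename=weights_file,
    )
    # Assumption: plotting is meaningful only after run() has finished.
    trainer.plot_scatter(num_rows=3, num_cols=3, fig_width=10, fig_height=9.5)


class _RecordingTrainer:
    """Stand-in for spaTrack.Trainer that only records the calls it receives."""

    def __init__(self) -> None:
        self.calls: List[Tuple[str, Dict[str, Any]]] = []

    def run(self, **kwargs: Any) -> None:
        self.calls.append(("run", kwargs))

    def plot_scatter(self, **kwargs: Any) -> None:
        self.calls.append(("plot_scatter", kwargs))


demo = _RecordingTrainer()
train_and_plot(demo)
```

With a real Trainer instance in place of demo, the same helper would train the model, save the weights file, and draw the scatter grid.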