docketanalyzer
Docket Management
Pacer
Utility for downloading PACER data.
Convenience wrapper around Free Law Project's juriscraper for downloading dockets and documents from PACER.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
pacer_username | str | PACER account username. If not provided, will use saved config or PACER_USERNAME from environment. | None |
pacer_password | str | PACER account password. If not provided, will use saved config or PACER_PASSWORD from environment. | None |
Attributes:

Name | Type | Description |
---|---|---|
pacer_username | str | The PACER account username |
pacer_password | str | The PACER account password |
cache | dict | Internal cache for storing session and driver instances |
Source code in docketanalyzer/pacer/pacer.py, lines 13–249
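Example (a minimal construction sketch; the import path `from docketanalyzer import Pacer` is an assumption and may differ in your install):

```python
import os

from docketanalyzer import Pacer  # import path assumed

# Credentials fall back to saved config or PACER_USERNAME / PACER_PASSWORD
# in the environment when not passed explicitly.
pacer = Pacer()

# Or pass them directly (values here are placeholders):
pacer = Pacer(
    pacer_username=os.environ.get("PACER_USERNAME"),
    pacer_password=os.environ.get("PACER_PASSWORD"),
)
```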
purchase_docket(docket_id, **kwargs)
Purchases a docket for a given docket ID.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
docket_id | str | The docket ID to purchase. | required |
**kwargs | Any | Additional query arguments to pass to juriscraper. | {} |

Returns:

Name | Type | Description |
---|---|---|
tuple | tuple[str, dict] | A tuple containing the raw HTML and the parsed docket JSON. |
Source code in docketanalyzer/pacer/pacer.py
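Example (a hedged usage sketch; the import path and the docket ID value are illustrative, not taken from this reference):

```python
from pathlib import Path

from docketanalyzer import Pacer  # import path assumed

pacer = Pacer()

# Illustrative docket ID; use whatever identifier your workflow tracks.
docket_html, docket_json = pacer.purchase_docket("insd__1_20-cv-01234")

Path("docket.html").write_text(docket_html)  # keep the raw HTML for later re-parsing
print(type(docket_json))                     # parsed docket as a dict
```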
purchase_document(pacer_case_id, pacer_doc_id, court)
Purchases a document for a given PACER case ID and document ID.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
pacer_case_id | str | The PACER case ID to purchase the document from. | required |
pacer_doc_id | str | The PACER document ID to purchase. | required |
court | str | The court to purchase the document from. | required |

Returns:

Name | Type | Description |
---|---|---|
tuple | tuple[bytes, str] | A tuple containing the PDF content and the status of the purchase. |
Source code in docketanalyzer/pacer/pacer.py
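Example (a sketch using the documented signature; all identifiers below are placeholders):

```python
from pathlib import Path

from docketanalyzer import Pacer  # import path assumed

pacer = Pacer()

pdf, status = pacer.purchase_document(
    pacer_case_id="123456",       # placeholder PACER case ID
    pacer_doc_id="045012345678",  # placeholder PACER document ID
    court="insd",
)
print(status)
if pdf:
    Path("document.pdf").write_bytes(pdf)
```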
purchase_attachment(pacer_case_id, pacer_doc_id, attachment_number, court)
Purchases an attachment for a given PACER case ID and document ID.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
pacer_case_id | str | The PACER case ID to purchase the attachment from. | required |
pacer_doc_id | str | The PACER document ID to purchase the attachment from. | required |
attachment_number | str | The attachment number to purchase. | required |
court | str | The court to purchase the attachment from. | required |

Returns:

Name | Type | Description |
---|---|---|
tuple | tuple[bytes, str] | A tuple containing the PDF content and the status of the purchase. |
Source code in docketanalyzer/pacer/pacer.py
parse(docket_html, court)
Parses the raw HTML of a docket and returns the parsed docket JSON.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
docket_html | str | The raw HTML of the docket. | required |
court | str | The court to parse the docket from. | required |

Returns:

Name | Type | Description |
---|---|---|
dict | dict | The parsed docket JSON. |
Source code in docketanalyzer/pacer/pacer.py
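Example (re-parsing previously saved HTML; the import path and file name are assumptions):

```python
from pathlib import Path

from docketanalyzer import Pacer  # import path assumed

pacer = Pacer()

# Re-parse HTML saved from an earlier purchase_docket call.
docket_html = Path("docket.html").read_text()
docket_json = pacer.parse(docket_html, court="insd")
```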
find_candidate_cases(docket_id)
Finds candidate PACER cases for a given docket ID.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
docket_id | str | The docket ID to search for. | required |

Returns:

Name | Type | Description |
---|---|---|
list | list[dict[str, str]] | A list of candidate cases. |
Source code in docketanalyzer/pacer/pacer.py
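Example (a sketch; the docket ID is illustrative and the keys of each candidate dict are not specified in this reference):

```python
from docketanalyzer import Pacer  # import path assumed

pacer = Pacer()

candidates = pacer.find_candidate_cases("insd__1_20-cv-01234")
for case in candidates:
    print(case)  # each candidate is a dict of case metadata
```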
Services
services
Database
A PostgreSQL database manager that provides high-level database operations.
This class handles database connections, table management, model registration, and provides an interface for table operations with schemaless tables through the Tables class.
Source code in docketanalyzer/services/psql.py, lines 406–600
meta
property
Get database metadata including table and column information.
Returns:

Name | Type | Description |
---|---|---|
dict | dict[str, dict[str, Any]] | Database metadata including table schemas and foreign keys |
__init__(connection=None, registered_models=None)
Initialize the database manager.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
connection | str | PostgreSQL connection URL | None |
registered_models | list | List of model classes to register with the database | None |
Source code in docketanalyzer/services/psql.py
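Example (a minimal sketch of constructing the manager directly; the import path and connection credentials are placeholders):

```python
from docketanalyzer import Database  # import path assumed

db = Database(connection="postgresql://user:pass@localhost:5432/docketanalyzer")

print(db.status())  # True if the connection is working
meta = db.meta      # dict of table/column metadata (see the meta property above)
```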
connect()
Establish connection to the PostgreSQL database using the connection URL.
Source code in docketanalyzer/services/psql.py
status()
Check if the database connection is working.
Returns:

Name | Type | Description |
---|---|---|
bool | bool | True if connection is successful, False otherwise |
reload()
Reload the database metadata and registered models.
register_model(model)
Register a model class with the database manager.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
model | type[DatabaseModel] | Peewee model class to register | required |
Source code in docketanalyzer/services/psql.py
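Example (a hypothetical registration sketch; DatabaseModel is a Peewee Model subclass, so standard Peewee fields apply, but the model name, fields, and any binding details beyond register_model are assumptions):

```python
import peewee

from docketanalyzer import Database, DatabaseModel  # import paths assumed

class Case(DatabaseModel):
    # Hypothetical fields for illustration.
    docket_id = peewee.CharField(unique=True)
    court = peewee.CharField(null=True)

db = Database(connection="postgresql://user:pass@localhost:5432/docketanalyzer")
db.register_model(Case)
db.create_table(Case)  # create the backing table if it doesn't already exist
```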
load_table_class(name, new=False)
Dynamically create a model class for a database table.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
name | str | Name of the table | required |
new | bool | Whether this is a new table being created | False |

Returns:

Name | Type | Description |
---|---|---|
type | type[DatabaseModel] | A new DatabaseModel subclass representing the table |

Raises:

Type | Description |
---|---|
KeyError | If table doesn't exist and new=False |
Source code in docketanalyzer/services/psql.py
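Example (a sketch; "cases" and "scratch" are hypothetical table names):

```python
from docketanalyzer import Database  # import path assumed

db = Database(connection="postgresql://user:pass@localhost:5432/docketanalyzer")

# Model class for an existing table; raises KeyError if it doesn't exist.
Cases = db.load_table_class("cases")

# Model class for a table that is about to be created:
Scratch = db.load_table_class("scratch", new=True)
```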
create_table(name_or_model, exists_ok=True)
Create a new table in the database.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
name_or_model | Union[str, Type[DatabaseModel]] | Name of the table to create or model class | required |
exists_ok | bool | Whether to silently continue if table exists | True |

Raises:

Type | Description |
---|---|
ValueError | If table exists and exists_ok=False |
Source code in docketanalyzer/services/psql.py
drop_table(name, confirm=True)
Drop a table from the database.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
name | str | Name of the table to drop | required |
confirm | bool | Whether to prompt for confirmation before dropping | True |

Raises:

Type | Description |
---|---|
Exception | If confirmation is required and user does not confirm |
Source code in docketanalyzer/services/psql.py
DatabaseModel
Bases: DatabaseModelQueryMixin, Model
A base model class that extends Peewee's Model with additional functionality.
This class provides enhanced database operations including pandas DataFrame conversion, batch processing, column management, and model reloading capabilities.
Source code in docketanalyzer/services/psql.py, lines 212–339
drop_column(column_name, confirm=True)
classmethod
Drop a column from the database table.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
column_name | str | Name of the column to drop | required |
confirm | bool | Whether to prompt for confirmation before dropping | True |
Source code in docketanalyzer/services/psql.py
add_column(column_name, column_type, null=True, overwrite=False, exists_ok=True, **kwargs)
classmethod
Add a new column to the database table.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
column_name | str | Name of the new column | required |
column_type | str | Peewee field type for the column | required |
null | bool | Whether the column can contain NULL values | True |
overwrite | bool | Whether to overwrite if column exists | False |
exists_ok | bool | Whether to silently continue if column exists | True |
**kwargs | Any | Additional field parameters passed to Peewee | {} |
Source code in docketanalyzer/services/psql.py
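Example (a sketch; the table name is hypothetical, and the exact field-type strings accepted for column_type, e.g. "DateField", are an assumption):

```python
from docketanalyzer import Database  # import path assumed

db = Database(connection="postgresql://user:pass@localhost:5432/docketanalyzer")
Cases = db.load_table_class("cases")  # hypothetical table name

# column_type is documented as a Peewee field type string.
Cases.add_column("filed_date", "DateField", null=True)
Cases.reload()  # refresh the model class to reflect the new schema
```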
add_data(data, copy=False, batch_size=1000)
classmethod
Add data to the table from a pandas DataFrame.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
data | DataFrame | DataFrame containing the data to insert | required |
copy | bool | Whether to use Postgres COPY command for faster insertion | False |
batch_size | int | Number of records to insert in each batch when not using COPY | 1000 |
Source code in docketanalyzer/services/psql.py
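Example (a sketch assuming a hypothetical "cases" table whose columns match the DataFrame):

```python
import pandas as pd

from docketanalyzer import Database  # import path assumed

db = Database(connection="postgresql://user:pass@localhost:5432/docketanalyzer")
Cases = db.load_table_class("cases")  # hypothetical table name

df = pd.DataFrame([
    {"docket_id": "insd__1_20-cv-01234", "court": "insd"},
    {"docket_id": "insd__1_20-cv-05678", "court": "insd"},
])

Cases.add_data(df)             # batched inserts, 1000 rows per batch by default
Cases.add_data(df, copy=True)  # or use Postgres COPY for large loads
```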
reload()
classmethod
Reload the model class to reflect any changes in the database schema.
Source code in docketanalyzer/services/psql.py
S3
A class for syncing local data with an S3 bucket.
Attributes:

Name | Type | Description |
---|---|---|
data_dir | Path | Local directory for data storage. |
bucket | Path | S3 bucket name. |
endpoint_url | Optional[str] | Custom S3 endpoint URL. |
client | client | Boto3 S3 client for direct API interactions. |
Source code in docketanalyzer/services/s3.py, lines 25–270
__init__(data_dir=None)
Initialize the S3 service.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
data_dir | Optional[str] | Path to local data directory. If None, uses env.DATA_DIR. | None |
Source code in docketanalyzer/services/s3.py
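Example (a minimal sketch; the import path is assumed):

```python
from docketanalyzer import S3  # import path assumed

s3 = S3()                 # local data directory defaults to env.DATA_DIR
s3 = S3(data_dir="data")  # or point at an explicit local directory
```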
push(path=None, from_path=None, to_path=None, **kwargs)
Push data from local storage to S3.
Syncs files from a local directory to an S3 bucket path.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
path | Optional[Union[str, Path]] | If provided, used as both from_path and to_path. | None |
from_path | Optional[Union[str, Path]] | Local source path to sync from. | None |
to_path | Optional[Union[str, Path]] | S3 destination path to sync to. | None |
**kwargs | Any | Additional arguments to pass to the AWS CLI s3 sync command. | {} |
Source code in docketanalyzer/services/s3.py
pull(path=None, from_path=None, to_path=None, **kwargs)
Pull data from S3 to local storage.
Syncs files from an S3 bucket path to a local directory.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
path | Optional[Union[str, Path]] | If provided, used as both from_path and to_path. | None |
from_path | Optional[Union[str, Path]] | S3 source path to sync from. | None |
to_path | Optional[Union[str, Path]] | Local destination path to sync to. | None |
**kwargs | Any | Additional arguments to pass to the AWS CLI s3 sync command. | {} |
Source code in docketanalyzer/services/s3.py
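Example covering both push and pull (a sketch; "dockets" is an illustrative path, and whether it resolves relative to data_dir or the working directory is an assumption to verify):

```python
from docketanalyzer import S3  # import path assumed

s3 = S3(data_dir="data")

s3.push("dockets")  # local -> bucket
s3.pull("dockets")  # bucket -> local
```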
download(s3_key, local_path=None)
Download a single file from S3 using the boto3 client.
This method downloads a specific file from S3 to a local path. If local_path is not provided, it will mirror the S3 path structure in the data directory.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
s3_key | str | The key of the file in the S3 bucket. | required |
local_path | Optional[Union[str, Path]] | The local path to save the file to. If None, the file will be saved to data_dir/s3_key. | None |

Returns:

Name | Type | Description |
---|---|---|
Path | Path | The path to the downloaded file. |

Raises:

Type | Description |
---|---|
ClientError | If the download fails. |
Source code in docketanalyzer/services/s3.py
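Example (a sketch; the S3 key and destination path are illustrative):

```python
from docketanalyzer import S3  # import path assumed

s3 = S3(data_dir="data")

# With no local_path, the file lands at data_dir/s3_key.
path = s3.download("dockets/insd__1_20-cv-01234.html")
print(path)

# Or choose the destination explicitly:
s3.download("dockets/insd__1_20-cv-01234.html", local_path="tmp/docket.html")
```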
upload(local_path, s3_key=None)
Upload a single file to S3 using the boto3 client.
This method uploads a specific file from a local path to S3. If s3_key is not provided, it will use the relative path from data_dir as the S3 key.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
local_path | Union[str, Path] | The local path of the file to upload. | required |
s3_key | Optional[str] | The key to use in the S3 bucket. If None, the relative path from data_dir will be used. | None |

Returns:

Name | Type | Description |
---|---|---|
str | str | The S3 key of the uploaded file. |

Raises:

Type | Description |
---|---|
FileNotFoundError | If the local file does not exist. |
ClientError | If the upload fails. |
Source code in docketanalyzer/services/s3.py
delete(s3_key)
Delete a single file from S3 using the boto3 client.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
s3_key | str | The key of the file in the S3 bucket to delete. | required |

Raises:

Type | Description |
---|---|
ClientError | If the deletion fails. |
Source code in docketanalyzer/services/s3.py
load_elastic(**kwargs)
Load an Elasticsearch client with the configured connection URL.
Run `da configure elastic` to set the connection URL.
load_psql()
Load a Database object using the connection URL in your config.
Run `da configure postgres` to set your PostgreSQL connection URL.
load_redis(**kwargs)
Load a Redis client with the configured connection URL.
Run `da configure redis` to set the connection URL.
load_s3(data_dir=None)
Load the S3 service.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
data_dir | Optional[Union[str, Path]] | Path to local data directory. If None, uses env.DATA_DIR. | None |

Returns:

Name | Type | Description |
---|---|---|
S3 | S3 | An instance of the S3 class. |
Source code in docketanalyzer/services/s3.py
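A sketch pulling the documented loaders together (top-level import paths assumed); each relies on the connection settings from `da configure`:

```python
from docketanalyzer import load_elastic, load_psql, load_redis, load_s3  # import paths assumed

es = load_elastic()   # Elasticsearch client (da configure elastic)
db = load_psql()      # Database instance (da configure postgres)
redis = load_redis()  # Redis client
s3 = load_s3()        # S3 service bound to env.DATA_DIR
```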
utils
extension_required
Context manager for extension imports.
Source code in docketanalyzer/utils/utils.py
__init__(extension)
__enter__()
__exit__(exc_type, exc_val, exc_tb)
Handle import errors with helpful messages.
Source code in docketanalyzer/utils/utils.py
timeit
Context manager for timing things.
Usage:

    with timeit("Task"):
        # do something
        do_something()
This will print the time taken to execute the block of code.
Source code in docketanalyzer/utils/utils.py
__init__(description='Task')
__enter__()
__exit__(exc_type, exc_val, exc_tb)
Print the execution time.
parse_docket_id(docket_id)
Parse a docket ID into a court and docket number.
construct_docket_id(court, docket_number)
Construct a docket ID from a court and docket number.
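Example for the two docket ID helpers (a sketch; the import paths, court, and docket number values are illustrative):

```python
from docketanalyzer import construct_docket_id, parse_docket_id  # import paths assumed

docket_id = construct_docket_id("insd", "1:20-cv-01234")
court, docket_number = parse_docket_id(docket_id)
print(court, docket_number)
```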
json_default(obj)
Default JSON serializer for datetime and date objects.
notabs(text)
download_file(url, path, description='Downloading')
Download file from URL to local path with progress bar.
Source code in docketanalyzer/utils/utils.py
generate_hash(data, salt=None, length=None)
Generate a hash for some data with optional salt.
Source code in docketanalyzer/utils/utils.py
generate_code(length=16)
Generate a random code of specified length.
pd_save_or_append(data, path, **kwargs)
Save or append a DataFrame to a CSV file.
Source code in docketanalyzer/utils/utils.py
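Example (a sketch; the import path is assumed and the append-on-subsequent-calls behavior is inferred from the description):

```python
import pandas as pd

from docketanalyzer import pd_save_or_append  # import path assumed

df = pd.DataFrame({"docket_id": ["insd__1_20-cv-01234"], "court": ["insd"]})

pd_save_or_append(df, "cases.csv")  # first call writes the CSV
pd_save_or_append(df, "cases.csv")  # later calls append rows
```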
datetime_utcnow()
list_to_array(data)
to_date(value)
Convert a value to a date if possible.