6 items found

Groups: sobigdata-it Tags: Web mining

Filter Results
  • Access required...

    ×

    Method

    Private Cybersecurity NER BERT-base-cased model

    This method includes a Python script and files of a BERT-base-cased model fine-tuned on our Cybersecurity NER dataset. The method requires as input a list of sentences that...
  • Access required...

    ×

    Dataset

    Private Cybersecurity NER dataset

    Our dataset is created by merging APTNER and CyNER datasets, containing 13601 sentences, 347779 tokens, and 37684 entities. The split ratio was roughly 70% for training and...
  • Method

    Cybersecurity NER SecureBERT model

    This method includes a Python script and files of a SecureBERT model fine-tuned on our Cybersecurity NER dataset. The method requires as input a list of sentences that will be...
    • JSON
      The resource: 'config' is not accessible as guest user. You must login to access it!
    • TXT
      The resource: 'merges' is not accessible as guest user. You must login to access it!
    • BIN
      The resource: 'model' is not accessible as guest user. You must login to access it!
    • JSON
      The resource: 'model_args' is not accessible as guest user. You must login to access it!
    • ZIP
      The resource: 'optimizer' is not accessible as guest user. You must login to access it!
    • ZIP
      The resource: 'scheduler' is not accessible as guest user. You must login to access it!
    • JSON
      The resource: 'special_tokens_map' is not accessible as guest user. You must login to access it!
    • JSON
      The resource: 'tokenizer' is not accessible as guest user. You must login to access it!
    • JSON
      The resource: 'tokenizer_config' is not accessible as guest user. You must login to access it!
    • ZIP
      The resource: 'training_args' is not accessible as guest user. You must login to access it!
    • TXT
      The resource: 'vocab' is not accessible as guest user. You must login to access it!
    • text/x-python
      The resource: 'inference' is not accessible as guest user. You must login to access it!
  • Method

    Cybersecurity NER RoBERTa-base model

    This method includes a Python script and files of a RoBERTa-base model fine-tuned on our Cybersecurity NER dataset. The method requires as input a list of sentences that will...
    • JSON
      The resource: 'config' is not accessible as guest user. You must login to access it!
    • TXT
      The resource: 'merges' is not accessible as guest user. You must login to access it!
    • BIN
      The resource: 'model' is not accessible as guest user. You must login to access it!
    • JSON
      The resource: 'model_args' is not accessible as guest user. You must login to access it!
    • ZIP
      The resource: 'scheduler' is not accessible as guest user. You must login to access it!
    • JSON
      The resource: 'special_tokens_map' is not accessible as guest user. You must login to access it!
    • JSON
      The resource: 'tokenizer_config' is not accessible as guest user. You must login to access it!
    • ZIP
      The resource: 'training_args' is not accessible as guest user. You must login to access it!
    • JSON
      The resource: 'tokenizer' is not accessible as guest user. You must login to access it!
    • JSON
      The resource: 'vocab' is not accessible as guest user. You must login to access it!
    • ZIP
      The resource: 'optimizer' is not accessible as guest user. You must login to access it!
    • py
      The resource: 'inference' is not accessible as guest user. You must login to access it!
  • Dataset

    SWH Filenames

    A 69 GB dataset with ~2.3 billion strings representing deduplicated names of source code files collected by Software Heritage, the great library of source code...
    • ZIP
      The resource: 'SWH Filenames' is not accessible as guest user. You must login to access it!
  • Dataset

    FAIR-SWENG: dataset on gender fairness in software engineering academic lands...

    The dataset contains academic performance metrics of Software Engineers worldwide.